Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Training of Isolated Word Recognizers with Continuous Speech

Reinhard Blasig, Georg Rose, Carsten Meyer

Philips Research Laboratories, Aachen, Germany

Is it possible to use out-of-domain acoustic training data to improve a speech recognizer's performance on a specific, independent application? In our experiments, we use Wall Street Journal (WSJ) data to train a recognizer, which is then adapted and evaluated in the Phonebook domain. Apart from their common language (US English), the two corpora differ in many important respects: microphone vs. telephone channel, continuous speech vs. isolated words, and a mismatch in speaking rate.

This paper addresses two questions. First, starting from the WSJ-trained recognizer, how much adaptation data (taken from the Phonebook training corpus) is necessary to achieve reasonable recognition performance in spite of the high degree of mismatch? Second, is it possible to improve the recognition performance of a Phonebook-trained baseline acoustic model by using additional out-of-domain training data? The paper describes the adaptation and normalization techniques used to bridge the mismatch between the two corpora.

Bibliographic reference. Blasig, Reinhard / Rose, Georg / Meyer, Carsten (2000): "Training of isolated word recognizers with continuous speech", in ICSLP-2000, vol. 1, 449-452.