ISCA Archive ICSLP 2000

Training of isolated word recognizers with continuous speech

Reinhard Blasig, Georg Rose, Carsten Meyer

Is it possible to use out-of-domain acoustic training data to improve a speech recognizer's performance on a specific, independent application? In our experiments, we use Wall Street Journal (WSJ) data to train a recognizer, which is adapted and evaluated in the Phonebook domain. Apart from their common language (US English), the two corpora differ in many important respects: microphone vs. telephone channel, continuous speech vs. isolated words, and a mismatch in speaking rate.

This paper deals with two questions. First, starting from the WSJ-trained recognizer, how much adaptation data (taken from the Phonebook training corpus) is necessary to achieve a reasonable recognition performance in spite of the high degree of mismatch? Second, is it possible to improve the recognition performance of a Phonebook-trained baseline acoustic model by using additional out-of-domain training data? The paper describes the adaptation and normalization techniques used to bridge the mismatch between the two corpora.


Cite as: Blasig, R., Rose, G., Meyer, C. (2000) Training of isolated word recognizers with continuous speech. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 449-452

@inproceedings{blasig00_icslp,
  author={Reinhard Blasig and Georg Rose and Carsten Meyer},
  title={{Training of isolated word recognizers with continuous speech}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={449--452}
}