EUROSPEECH 2003 - INTERSPEECH 2003
A speech recognition architecture combining topic detection and topic-dependent language modeling is proposed. In this architecture, a hierarchical back-off mechanism is introduced to improve system robustness: detailed topic models are applied when topic detection is confident, and wider models that cover multiple topics are applied in cases of uncertainty. Two topic detection methods are evaluated for the architecture: unigram likelihood and a support vector machine (SVM). On the ATR Basic Travel Expression corpus, the two topic detection methods provide comparable reductions in word error rate (WER) of 10.0% and 11.1%, respectively, over a single language model system. Finally, the proposed re-decoding approach is compared with an equivalent system based on re-scoring. It is shown that re-decoding is vital to provide optimal recognition performance.
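The hierarchical back-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the topic names, the tree shape, the 0.6 threshold, and the use of summed topic posteriors as the confidence measure are all assumptions made for illustration. The idea shown is only the selection rule: start at the most likely detailed topic and move up to wider models until the confidence is high enough.

```python
class TopicNode:
    """A node in a hypothetical topic hierarchy; each node stands for one language model."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add(self, name):
        child = TopicNode(name, parent=self)
        self.children.append(child)
        return child

    def leaves(self):
        """Return all detailed (leaf) topic nodes covered by this node."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found


def select_model(root, posteriors, threshold=0.6):
    """Pick the most detailed model whose confidence reaches the threshold.

    posteriors: dict mapping leaf topic name -> detection score in [0, 1]
    (assumed here to sum to 1). The confidence of a wider model is taken
    to be the sum of the scores of the leaf topics it covers (an assumption).
    """
    leaf_by_name = {leaf.name: leaf for leaf in root.leaves()}
    best_name = max(posteriors, key=posteriors.get)
    node = leaf_by_name[best_name]
    while node.parent is not None:
        confidence = sum(posteriors.get(leaf.name, 0.0) for leaf in node.leaves())
        if confidence >= threshold:
            return node.name
        node = node.parent  # back off to a wider model
    return node.name  # the root model covers all topics


def build_example_hierarchy():
    """Hypothetical two-level hierarchy; the topic names are illustrative only."""
    root = TopicNode("all_topics")
    travel = root.add("transport")
    stay = root.add("stay")
    travel.add("airport")
    travel.add("airplane")
    stay.add("hotel")
    stay.add("restaurant")
    return root


if __name__ == "__main__":
    root = build_example_hierarchy()
    confident = {"airport": 0.7, "airplane": 0.2, "hotel": 0.05, "restaurant": 0.05}
    uncertain = {"airport": 0.45, "airplane": 0.4, "hotel": 0.1, "restaurant": 0.05}
    print(select_model(root, confident))
    print(select_model(root, uncertain))
```

In this sketch a confident detection selects the detailed topic model, while an ambiguous one falls back to the wider model covering both candidate topics. The selected model would then be used for a second decoding pass; that step is outside the scope of the sketch.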
Bibliographic reference. Lane, Ian R. / Kawahara, Tatsuya / Matsui, Tomoko / Nakamura, Satoshi (2003): "Hierarchical topic classification for dialog speech recognition based on language model switching", In EUROSPEECH-2003, 429-432.