EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Hierarchical Topic Classification for Dialog Speech Recognition Based on Language Model Switching

Ian R. Lane (1), Tatsuya Kawahara (1), Tomoko Matsui (2), Satoshi Nakamura (3)

(1) Kyoto University, Japan
(2) Institute of Statistical Mathematics, Japan
(3) ATR-SLT, Japan

A speech recognition architecture that combines topic detection with topic-dependent language modeling is proposed. A hierarchical back-off mechanism is introduced to improve system robustness: detailed topic models are applied when topic detection is confident, and wider models that cover multiple topics are applied when it is uncertain. Two topic detection methods are evaluated for the architecture: unigram likelihood and SVM (Support Vector Machine). On the ATR Basic Travel Expression corpus, the two methods provide comparable reductions in WER of 10.0% (unigram likelihood) and 11.1% (SVM) over a single language model system. Finally, the proposed re-decoding approach is compared with an equivalent system based on re-scoring. It is shown that re-decoding is vital to provide optimal recognition performance.
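The back-off rule described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the topic names, the three-level hierarchy, the use of normalized per-topic scores as confidence, summing leaf scores to score a wider node, and the 0.7 threshold. The function names `ancestors_or_self`, `node_confidence`, and `select_model` are likewise hypothetical.

```python
from typing import Dict, Iterator, Optional

# Hypothetical topic hierarchy (not from the paper): node -> parent, root -> None.
PARENT: Dict[str, Optional[str]] = {
    "hotel": "accommodation_and_travel",
    "airport": "accommodation_and_travel",
    "restaurant": "food_and_shopping",
    "shopping": "food_and_shopping",
    "accommodation_and_travel": "general",
    "food_and_shopping": "general",
    "general": None,
}


def ancestors_or_self(node: Optional[str]) -> Iterator[str]:
    """Yield the node itself, then each ancestor up to the root."""
    while node is not None:
        yield node
        node = PARENT[node]


def node_confidence(node: str, leaf_scores: Dict[str, float]) -> float:
    """Assumed confidence of a node: total score of the leaf topics it covers."""
    return sum(
        score
        for leaf, score in leaf_scores.items()
        if node in ancestors_or_self(leaf)
    )


def select_model(leaf_scores: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the most detailed node whose confidence reaches the threshold.

    Starts at the best-scoring leaf topic and backs off to wider nodes;
    the root (single general model) is always accepted as a last resort.
    """
    best_leaf = max(leaf_scores, key=leaf_scores.get)
    for node in ancestors_or_self(best_leaf):
        if PARENT[node] is None or node_confidence(node, leaf_scores) >= threshold:
            return node
    raise ValueError("hierarchy has no root")  # unreachable for a valid tree


# Confident detection: the detailed topic model is chosen.
confident = {"hotel": 0.85, "airport": 0.05, "restaurant": 0.05, "shopping": 0.05}
# Uncertain between two related topics: back off to the model covering both.
related = {"hotel": 0.45, "airport": 0.40, "restaurant": 0.10, "shopping": 0.05}
# No clear topic: back off to the general model.
flat = {"hotel": 0.30, "airport": 0.20, "restaurant": 0.30, "shopping": 0.20}

for scores in (confident, related, flat):
    print(select_model(scores))
```

In a system of the kind the abstract describes, the selected node would determine which language model is used for the re-decoding pass; the scores themselves would come from a topic detector such as the unigram likelihood or SVM classifiers named above.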

Bibliographic reference.  Lane, Ian R. / Kawahara, Tatsuya / Matsui, Tomoko / Nakamura, Satoshi (2003): "Hierarchical topic classification for dialog speech recognition based on language model switching", In EUROSPEECH-2003, 429-432.