8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Mis-Recognized Utterance Detection Using Hierarchical Language Model

Hirofumi Yamamoto (1), Genichiro Kikui (1), Yoshinori Sagisaka (2)

(1) ATR Spoken Language Translation Research Labs., Japan
(2) Waseda Univ., Japan

This paper proposes a scheme for detecting and modifying mis-recognized utterances in order to recover from speech recognition errors in speech translation. Mis-recognition is frequently observed at the speech recognition stage, and most mis-recognitions result from acoustic model mismatch or out-of-vocabulary (OOV) words. To cope with both, we adopt a hierarchical language model to identify them. A hierarchical language model can generate hypotheses both with and without OOVs (or acoustically mismatched words). The likelihood difference between these hypotheses is used as an utterance confidence measure. To confirm the feasibility of this scheme, as a first experiment, we conducted speech recognition and mis-recognized utterance detection experiments. The experimental results showed a 99% detection rate for utterances with OOVs.
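
The likelihood-difference idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the sign convention nor a decision rule, so the container `HypothesisPair`, the functions `likelihood_gap` and `is_suspect`, the direction of the difference, and the threshold value are all assumptions made for illustration. The sketch assumes the recognizer returns a log-likelihood for the best hypothesis that may contain OOV segments and another for the best hypothesis restricted to in-vocabulary words.

```python
from dataclasses import dataclass


@dataclass
class HypothesisPair:
    """Log-likelihoods of two competing hypotheses for one utterance (hypothetical container)."""
    loglik_with_oov: float     # best hypothesis that may contain OOV segments
    loglik_without_oov: float  # best hypothesis restricted to in-vocabulary words


def likelihood_gap(pair: HypothesisPair) -> float:
    """Return how much better the OOV-allowing hypothesis scores (assumed sign convention)."""
    return pair.loglik_with_oov - pair.loglik_without_oov


def is_suspect(pair: HypothesisPair, threshold: float = 5.0) -> bool:
    """Flag the utterance as possibly mis-recognized when the gap exceeds a threshold.

    The threshold is an arbitrary illustrative value, not one taken from the paper.
    """
    return likelihood_gap(pair) > threshold


if __name__ == "__main__":
    # A large gap suggests the in-vocabulary hypothesis explains the audio poorly.
    print(is_suspect(HypothesisPair(loglik_with_oov=-100.0, loglik_without_oov=-120.0)))
    # A small gap suggests both hypotheses explain the audio about equally well.
    print(is_suspect(HypothesisPair(loglik_with_oov=-100.0, loglik_without_oov=-101.0)))
```

In practice a threshold like this would presumably be tuned on held-out data; the abstract does not say how the decision is made.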


Bibliographic reference. Yamamoto, Hirofumi / Kikui, Genichiro / Sagisaka, Yoshinori (2004): "Mis-recognized utterance detection using hierarchical language model", In INTERSPEECH-2004, 1025-1028.