7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

A Confidence Measure Based on Agreement Among Multiple LVCSR Models - Correlation Between Pair of Acoustic Models and Confidence

Takehito Utsuro, Tetsuji Harada, Hiromitsu Nishizaki, Seiichi Nakagawa

Toyohashi University of Technology, Japan

For many practical applications of speech recognition systems, it is desirable to have an estimate of confidence for each hypothesized word. Unlike previous work on confidence measures, this paper studies features for confidence measures that are extracted from the outputs of more than one LVCSR model. More specifically, this paper experimentally evaluates whether the agreement among the outputs of multiple Japanese LVCSR models is effective as an estimate of confidence for each hypothesized word. The experimental results show that the agreement between the outputs of two LVCSR models with different decoders and acoustic models can achieve quite reliable confidence. Furthermore, among the various features of acoustic models based on Gaussian mixture HMMs, it is concluded that features such as the presence or absence of short pause models and the choice of HMM unit (e.g., triphone model or syllable model) are the most effective in achieving highly reliable confidence.
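As a rough illustration of the idea of agreement-based confidence, the following minimal sketch marks each word hypothesized by one recognizer as confident when a second recognizer outputs the same word at an aligned position. It is not the procedure used in the paper: the function name `agreement_confidence`, the English example hypotheses, and the use of Python's `difflib.SequenceMatcher` for word alignment are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def agreement_confidence(hyp_a, hyp_b):
    """Label each word of hyp_a with True if it is aligned to an identical
    word in hyp_b, and False otherwise.

    hyp_a, hyp_b: lists of words output by two different recognizers
    (hypothetical inputs; the alignment method is an assumption).
    """
    agreed = [False] * len(hyp_a)
    # autojunk=False disables the popularity heuristic so that frequent
    # words are not silently excluded from the alignment.
    matcher = SequenceMatcher(a=hyp_a, b=hyp_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Each block is a run of identical words shared by both outputs.
        for i in range(block.a, block.a + block.size):
            agreed[i] = True
    return list(zip(hyp_a, agreed))


# Made-up outputs of two recognizers for the same utterance.
model_a = ["the", "cat", "sat", "on", "a", "mat"]
model_b = ["the", "cat", "sat", "in", "the", "mat"]
labelled = agreement_confidence(model_a, model_b)
```

In this sketch, words on which the two outputs agree ("the", "cat", "sat", "mat") would be treated as confident, while the remaining words ("on", "a") would be treated as unreliable.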



Bibliographic reference.  Utsuro, Takehito / Harada, Tetsuji / Nishizaki, Hiromitsu / Nakagawa, Seiichi (2002): "A confidence measure based on agreement among multiple LVCSR models - correlation between pair of acoustic models and confidence", In ICSLP-2002, 701-704.