ISCA Archive ICSLP 2000

The use of dynamic reliability scoring in speech recognition

Xiaolong Mou, Victor Zue

Along a recognizer's search path, some acoustic units are typically modeled more reliably than others, owing to differences in their acoustic-phonetic features and many other factors. This paper presents a dynamic reliability scoring scheme that adjusts partial path scores while the recognizer searches through the composed lexical and acoustic-phonetic network. The reliability models are trained on the acoustic scores of the correct arc and of the immediate competing arcs that extend the current partial path. During recognition, if the trained reliability models indicate that an arc is easily distinguished from its competing alternatives, that arc is more likely to lie on the correct path, and the partial path score is adjusted accordingly on the fly to yield a more accurate path hypothesis. We have applied this reliability scoring mechanism in two weather-related domains, JUPITER [1] (for English) and PANDA (a predecessor of MUXING [2], for Mandarin Chinese). We obtain a 9.8% word error rate (WER) reduction in the JUPITER domain and a 12.4% WER reduction in the PANDA domain, demonstrating the effectiveness of this approach.
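
The scoring idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the reliability models, so the sketch assumes, purely for illustration, that each model is a pair of one-dimensional Gaussians over the margin between an arc's acoustic score and the best competing arc's score, and that the reliability measure is their log-likelihood ratio. All function names and the `weight` parameter are hypothetical.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(samples):
    """Fit a 1-D Gaussian; floor the std so the log-density stays finite."""
    return mean(samples), max(pstdev(samples), 1e-3)


def log_gaussian(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def train_reliability_model(correct_margins, incorrect_margins):
    """Assumed model form: one Gaussian over margins observed for correct
    arcs and one over margins observed for incorrect (competing) arcs.
    A margin is an arc's acoustic score minus its best competitor's score."""
    return fit_gaussian(correct_margins), fit_gaussian(incorrect_margins)


def reliability_score(margin, model):
    """Log-likelihood ratio: positive when the margin looks like those
    seen for correct arcs, negative when it looks like an incorrect arc."""
    (mu_c, sd_c), (mu_i, sd_i) = model
    return log_gaussian(margin, mu_c, sd_c) - log_gaussian(margin, mu_i, sd_i)


def extend_path(path_score, arc_score, competitor_scores, model, weight=0.5):
    """Extend a partial path by one arc, adding a weighted reliability
    term to the usual acoustic score (the weight is a hypothetical knob)."""
    margin = arc_score - max(competitor_scores)
    return path_score + arc_score + weight * reliability_score(margin, model)
```

Under these assumptions, an arc that clearly outscores its competitors receives a positive adjustment and an arc that is hard to distinguish from them receives a negative one, so the adjustment is applied as each arc is added rather than after the search completes.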

[1] V. Zue, S. Seneff, J. Glass, J. Polifroni, C. Pao, T. Hazen, and L. Hetherington, "JUPITER: A telephone-based conversational interface for weather information," IEEE Trans. on Speech and Audio Processing, vol. 8, no. 1, pp. 85-96, Jan. 2000.
[2] C. Wang, S. Cyphers, X. Mou, J. Polifroni, S. Seneff, J. Yi, and V. Zue, "A telephone-access Mandarin conversational system in the weather domain," in these proceedings.


Cite as: Mou, X., Zue, V. (2000) The use of dynamic reliability scoring in speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 492-495

@inproceedings{mou00_icslp,
  author={Xiaolong Mou and Victor Zue},
  title={{The use of dynamic reliability scoring in speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 492-495}
}