ISCA Archive Interspeech 2006

A multi-pass error detection and correction framework for Mandarin LVCSR

Zhengyu Zhou, Helen M. Meng, Wai Kit Lo

We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models to recognition while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection, and error correction. This paper presents and evaluates a prototype of the multi-pass framework for Mandarin dictation. In this prototype, the first pass recognizes speech with a well-trained, state-of-the-art recognizer incorporating an efficient language model; the second pass detects recognition errors with a new three-step error detection procedure; and the third pass corrects the errors detected in lightly erroneous utterances with a novel error correction approach. The error correction algorithm first creates a candidate list for each detected error and then re-ranks the candidates with a combined model of mutual information and trigram. Mandarin dictation experiments show a 4% relative reduction in character error rate (CER) over the initial recognition performance on the lightly erroneous utterances detected.
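The re-ranking step described above can be illustrated with a minimal sketch. The abstract does not say how the mutual information and trigram scores are combined, so everything here is an assumption made for illustration: the linear interpolation weight `lam`, the add-one smoothing, the sentence-level co-occurrence counts used for pointwise mutual information, and the hypothetical class name `CandidateReranker`. The toy corpus is invented and is not data from the paper.

```python
import math
from collections import Counter


class CandidateReranker:
    """Hypothetical re-ranker: interpolates a trigram score with a PMI score."""

    def __init__(self, corpus, lam=0.5):
        self.lam = lam                # assumed interpolation weight
        self.tri = Counter()          # trigram counts
        self.ctx = Counter()          # counts of two-word histories
        self.word_sents = Counter()   # number of sentences containing a word
        self.pair_sents = Counter()   # number of sentences containing both words
        self.n_sents = len(corpus)
        vocab = set()
        for sent in corpus:
            vocab.update(sent)
            padded = ["<s>", "<s>"] + list(sent)
            for i in range(2, len(padded)):
                self.tri[tuple(padded[i - 2:i + 1])] += 1
                self.ctx[tuple(padded[i - 2:i])] += 1
            types = set(sent)
            for w in types:
                self.word_sents[w] += 1
                for v in types:
                    if w != v:
                        self.pair_sents[(w, v)] += 1
        self.vocab_size = len(vocab)

    def trigram_logprob(self, w1, w2, w3):
        # Add-one smoothed log P(w3 | w1, w2); the smoothing is an assumption.
        num = self.tri[(w1, w2, w3)] + 1
        den = self.ctx[(w1, w2)] + self.vocab_size
        return math.log(num / den)

    def pmi(self, a, b):
        # Crudely smoothed pointwise mutual information from sentence co-occurrence.
        total = self.n_sents + 1
        p_ab = (self.pair_sents[(a, b)] + 1) / total
        p_a = (self.word_sents[a] + 1) / total
        p_b = (self.word_sents[b] + 1) / total
        return math.log(p_ab / (p_a * p_b))

    def score(self, hyp, pos, cand):
        words = list(hyp)
        words[pos] = cand
        padded = ["<s>", "<s>"] + words
        k = pos + 2  # index of the candidate in the padded sequence
        last = min(k + 2, len(padded) - 1)
        # Sum the log-probabilities of every trigram that contains the candidate.
        tri_lp = sum(self.trigram_logprob(*padded[i - 2:i + 1])
                     for i in range(k, last + 1))
        context = [w for j, w in enumerate(words) if j != pos]
        mi = sum(self.pmi(cand, w) for w in context) / len(context) if context else 0.0
        return self.lam * tri_lp + (1 - self.lam) * mi

    def rerank(self, hyp, pos, candidates):
        # Best-scoring candidate first.
        return sorted(candidates, key=lambda c: self.score(hyp, pos, c), reverse=True)


# Toy usage: position 2 of the hypothesis was flagged as an error.
corpus = [
    ["我", "喜欢", "音乐"],
    ["我", "喜欢", "音乐"],
    ["他", "喜欢", "跑步"],
    ["银月", "很", "美"],
]
model = CandidateReranker(corpus, lam=0.5)
ranked = model.rerank(["我", "喜欢", "银月"], 2, ["银月", "音乐"])
print(ranked)
```

In this sketch the trigram term captures local word order, while the mutual information term rewards candidates that tend to co-occur with the other words in the utterance, which is one plausible reason to combine the two.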

doi: 10.21437/Interspeech.2006-459

Cite as: Zhou, Z., Meng, H.M., Lo, W.K. (2006) A multi-pass error detection and correction framework for Mandarin LVCSR. Proc. Interspeech 2006, paper 1947-Wed1CaP.12, doi: 10.21437/Interspeech.2006-459

  author={Zhengyu Zhou and Helen M. Meng and Wai Kit Lo},
  title={{A multi-pass error detection and correction framework for Mandarin LVCSR}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1947-Wed1CaP.12},