Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

An Intelligent and Efficient Word-Class-Based Chinese Language Model for Mandarin Speech Recognition with Very Large Vocabulary

Yen-Ju Yang (1), Sung-Chien Lin (1), Lee-Feng Chien (2), Keh-Jiann Chen (2), Lin-Shan Lee (1,2,3)

(1) Dept. of Computer Science and Information Engineering, National Taiwan University, Taiwan
(2) Institute of Information Science, Academia Sinica, Taiwan
(3) Dept. of Electrical Engineering, National Taiwan University, Taipei, Taiwan

This paper proposes a word-class-based Chinese language model for Mandarin speech recognition with a very large vocabulary. The word classes are developed based on the special structure of Chinese words. We have also developed several improved techniques. The ambiguous syllable filter deletes many confusable syllables and significantly increases accuracy. The short-term cache memory helps the language model adapt to the current application domain, and the learning module significantly reduces the number of zero values in the language model.
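As a rough illustration of two of the ideas named in the abstract, the sketch below combines a generic class-based bigram, P(w | w_prev) ≈ P(c | c_prev) · P(w | c), with a short-term cache of recently seen words. It is a textbook-style formulation under stated assumptions, not the authors' model: the class map, the toy pinyin vocabulary, the linear interpolation, and the names `ClassBigramWithCache`, `cache_size`, and `cache_weight` are all hypothetical, and the abstract does not say how the paper's word classes, cache, filter, or learning module are actually defined.

```python
from collections import Counter, deque


class ClassBigramWithCache:
    """Generic class-based bigram with a short-term word cache (illustrative only)."""

    def __init__(self, word_to_class, cache_size=200, cache_weight=0.2):
        self.word_to_class = word_to_class      # hypothetical word -> class map
        self.class_bigram = Counter()           # counts of (c_prev, c)
        self.class_history = Counter()          # counts of c_prev as a bigram history
        self.word_in_class = Counter()          # counts of (c, w)
        self.class_total = Counter()            # counts of c over all tokens
        self.cache = deque(maxlen=cache_size)   # most recent words only
        self.cache_weight = cache_weight        # assumed interpolation weight

    def train(self, sentences):
        for sent in sentences:
            classes = [self.word_to_class[w] for w in sent]
            for w, c in zip(sent, classes):
                self.word_in_class[(c, w)] += 1
                self.class_total[c] += 1
            for c_prev, c in zip(classes, classes[1:]):
                self.class_bigram[(c_prev, c)] += 1
                self.class_history[c_prev] += 1

    def static_prob(self, prev, word):
        """P(c | c_prev) * P(w | c), estimated by relative frequency."""
        c_prev, c = self.word_to_class[prev], self.word_to_class[word]
        if self.class_history[c_prev] == 0 or self.class_total[c] == 0:
            return 0.0
        p_class = self.class_bigram[(c_prev, c)] / self.class_history[c_prev]
        p_word = self.word_in_class[(c, word)] / self.class_total[c]
        return p_class * p_word

    def cache_prob(self, word):
        """Relative frequency of the word among recently observed words."""
        if not self.cache:
            return 0.0
        return self.cache.count(word) / len(self.cache)

    def prob(self, prev, word):
        """Interpolate the static model with the cache (static only if cache is empty)."""
        lam = self.cache_weight if self.cache else 0.0
        return (1 - lam) * self.static_prob(prev, word) + lam * self.cache_prob(word)

    def observe(self, word):
        """Record a recognized word so later predictions adapt to the current domain."""
        self.cache.append(word)


# Toy data: the classes and pinyin words are made up for this example.
toy_classes = {"wo": "PRON", "ni": "PRON", "kan": "VERB",
               "du": "VERB", "shu": "NOUN", "bao": "NOUN"}
model = ClassBigramWithCache(toy_classes)
model.train([["wo", "kan", "shu"], ["ni", "du", "bao"]])

unseen_pair = model.prob("wo", "du")   # word pair never seen in training
model.observe("du")
adapted = model.prob("wo", "du")       # same pair after "du" enters the cache
```

In this toy run the word pair ("wo", "du") never occurs in the training sentences, yet it receives a nonzero probability because the class pair (PRON, VERB) does occur. That is the general reason class-based models leave fewer zero entries than word-based ones. Once "du" has been observed, the cache raises its probability further.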

Bibliographic reference. Yang, Yen-Ju / Lin, Sung-Chien / Chien, Lee-Feng / Chen, Keh-Jiann / Lee, Lin-Shan (1994): "An intelligent and efficient word-class-based Chinese language model for Mandarin speech recognition with very large vocabulary", in ICSLP-1994, 1371-1374.