Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Phonetic State Tied-Mixture Tone Modeling for Large Vocabulary Continuous Mandarin Speech Recognition

Tai-Hsuan Ho (1,2), Chin-Jung Liu (1), Herman Sun (1), Ming-Yi Tsai (3), Lin-Shan Lee (2,3)

(1) Applied Speech Technologies, Inc. Taiwan
(2) National Taiwan University, Department of Computer Science & Information Engineering, Taiwan
(3) National Taiwan University, Department of Electrical Engineering, Taiwan

This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. We treat the tone as an attribute of the final (vowel) part of a Mandarin syllable. Separate distributions are estimated for the cepstral coefficients and for the pitch features, and the phonetic state tied-mixture technique is exploited to improve the modeling. Several tying structures are investigated, and the results are compared with those obtained without tonal parameters. After the tone models are integrated, decent improvements are achieved in large vocabulary continuous Mandarin speech recognition. In addition, the approach can be easily incorporated into the one-pass Viterbi search framework for practical implementation of a Mandarin dictation system.
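
The idea of scoring cepstral and pitch features with separate distributions, each drawn from a tied (shared) pool of Gaussians, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the diagonal-covariance Gaussians, the summation of the two stream log-likelihoods, and the toy codebooks. The abstract does not specify the tying structures that were actually investigated.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_tied_mixture(x, codebook, weights):
    """Log-likelihood of x under a tied mixture.

    The Gaussians in `codebook` are shared by every state that is tied
    to it; only the mixture `weights` are specific to the state.
    """
    terms = [
        math.log(w) + log_gauss_diag(x, mean, var)
        for w, (mean, var) in zip(weights, codebook)
        if w > 0.0
    ]
    return log_sum_exp(terms)


def state_log_likelihood(cep, pitch, state):
    """Score one frame: the cepstral and pitch streams are modeled by
    separate distributions and their log-likelihoods are added
    (an assumed way of combining the streams)."""
    return (
        log_tied_mixture(cep, state["cep_codebook"], state["cep_weights"])
        + log_tied_mixture(pitch, state["pitch_codebook"], state["pitch_weights"])
    )


# Hypothetical toy setup: two states of the same final share all Gaussians
# and differ only in their pitch mixture weights (that is, in tone).
cep_codebook = [([0.0, 0.0], [1.0, 1.0])]
pitch_codebook = [([-1.0], [0.5]), ([1.0], [0.5])]  # low and high pitch

high_tone_state = {
    "cep_codebook": cep_codebook, "cep_weights": [1.0],
    "pitch_codebook": pitch_codebook, "pitch_weights": [0.1, 0.9],
}
low_tone_state = {
    "cep_codebook": cep_codebook, "cep_weights": [1.0],
    "pitch_codebook": pitch_codebook, "pitch_weights": [0.9, 0.1],
}

frame_cep, frame_pitch = [0.0, 0.0], [1.0]  # a frame with high pitch
score_high = state_log_likelihood(frame_cep, frame_pitch, high_tone_state)
score_low = state_log_likelihood(frame_cep, frame_pitch, low_tone_state)
```

In this toy setup the two states are identical in the cepstral stream, so any difference between `score_high` and `score_low` comes from the pitch stream alone. Per-frame scores of this kind are what a Viterbi search would accumulate along each path.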



Bibliographic reference.  Ho, Tai-Hsuan / Liu, Chin-Jung / Sun, Herman / Tsai, Ming-Yi / Lee, Lin-Shan (1999): "Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition", In EUROSPEECH'99, 883-886.