Sixth European Conference on Speech Communication and Technology
This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. In this paper, we treat the tone as an attribute of the final vowel part of a Mandarin syllable. Separate distributions are estimated for the cepstral coefficients and the pitch features, and the phonetic state tied-mixture technique is exploited to improve the modeling. Several tying structures are investigated, and the results are compared with those obtained without tonal parameters. After the tone models are integrated, a decent improvement is achieved in large vocabulary continuous Mandarin speech recognition. Moreover, this approach can easily be incorporated into the one-pass Viterbi search framework for the practical implementation of a Mandarin dictation system.
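The idea of separate cepstral and pitch distributions with tied mixtures can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tying structures, so the sketch assumes one Gaussian codebook per feature stream shared by all states, state-specific mixture weights, and independence between the two streams (so their log-likelihoods add). All names (`state_log_likelihood`, `cep_codebook`, `pitch_codebook`, the toy numbers) are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_mixture(x, weights, codebook):
    """Log-likelihood of x under a mixture over a shared (tied) codebook.

    Only the weights belong to the state; the Gaussians are shared.
    """
    return log_sum_exp([
        math.log(w) + log_gauss_diag(x, mean, var)
        for w, (mean, var) in zip(weights, codebook)
        if w > 0.0
    ])

def state_log_likelihood(cepstrum, pitch, state, cep_codebook, pitch_codebook):
    """Assumed stream independence: add the cepstral and pitch log-likelihoods."""
    return (log_mixture(cepstrum, state["cep_weights"], cep_codebook)
            + log_mixture(pitch, state["pitch_weights"], pitch_codebook))

# Toy codebooks of (mean, variance) pairs; the values are made up.
cep_codebook = [([0.0, 0.0], [1.0, 1.0]), ([1.0, 1.0], [1.0, 1.0])]
pitch_codebook = [([0.0], [1.0]), ([2.0], [1.0])]

# Two states with identical cepstral weights but different pitch weights,
# standing in for the same vowel carrying two different tones.
state_a = {"cep_weights": [0.5, 0.5], "pitch_weights": [0.9, 0.1]}
state_b = {"cep_weights": [0.5, 0.5], "pitch_weights": [0.1, 0.9]}

score_a = state_log_likelihood([0.5, 0.5], [2.0], state_a, cep_codebook, pitch_codebook)
score_b = state_log_likelihood([0.5, 0.5], [2.0], state_b, cep_codebook, pitch_codebook)
```

In this toy setup the two states cannot be separated by the cepstral stream alone, so any difference between `score_a` and `score_b` comes from the pitch stream. Because each state contributes a single additive log score per frame, a score of this form can be used as the emission term in a standard one-pass Viterbi search.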
Bibliographic reference. Ho, Tai-Hsuan / Liu, Chin-Jung / Sun, Herman / Tsai, Ming-Yi / Lee, Lin-Shan (1999): "Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition", In EUROSPEECH'99, 883-886.