Sixth International Conference on Spoken Language Processing
In this paper, we present the Philips large-vocabulary continuous Mandarin speech recognition system developed for the 2000 Taiwan Speech Input Technology Assessment. We systematically integrated key Mandarin-specific components with up-to-date Western-language techniques to build a state-of-the-art Mandarin speech recognition system. These technologies include robust pitch extraction and tone modeling, context-dependent preme/core-final units, a Chinese phrase/syllable trigram language model, linear discriminant analysis (LDA), cross-syllable modeling and decoding, speaker clustering, and maximum likelihood linear regression (MLLR) adaptation. Among them, the major breakthroughs were our robust pitch extraction and tone modeling technology and the treatment of coarticulation across syllable boundaries. On the development set, we reduced last year’s best error rates by 44.8% to 67.8% relative in all three categories in which we participated. Moreover, on the evaluation set, we achieved the lowest unit error rates in all three categories.
Bibliographic reference. Liao, Yuan-Fu / Wang, Nick / Huang, Max / Huang, Hank / Seide, Frank (2000): "Improvements of the Philips 2000 Taiwan Mandarin benchmark system", In ICSLP-2000, vol.4, 298-301.