EUROSPEECH 2001 Scandinavia
Pronunciation variability is an important issue that must be addressed when developing practical automatic recognition systems for spontaneous speech. By studying the initial/final (IF) characteristics of the Chinese language and developing the Bayesian equation, we propose the concepts of the generalized initial/final (GIF) and the generalized syllable (GS), the GIF modeling method and the IF-GIF modeling method, and a context-dependent pronunciation weighting method. With these approaches, IF-GIF modeling reduces the Chinese syllable error rate (SER) by 6.3% compared with GIF modeling and by 4.2% compared with IF modeling when no language model, such as a syllable or word N-gram, is used.
Bibliographic reference. Zheng, Fang / Song, Zhanjiang / Fung, Pascale / Byrne, William (2001): "Modeling pronunciation variation using context-dependent weighting and b/s refined acoustic modeling", In EUROSPEECH-2001, 57-60.