The choice of basic modeling unit is an important issue when building an acoustic model for continuous Mandarin speech recognition. Traditional acoustic modeling methods based on phonemes or Initials/Finals (IFs) are limited in how accurately they model intra-syllable variations and long-span temporal dependencies. This paper presents a practical syllable-based approach. In contrast with IFs, syllable units implicitly model intra-syllable variations with good accuracy. In addition, with carefully chosen context modeling schemes and parameter tying methods, a syllable-based acoustic model can capture longer temporal variations while keeping model complexity under control. To address the data imbalance problem, approaches based on multiple-sized unit models are also implemented. Experimental results show that the acoustic model built with the presented syllable-based approach is effective in improving the performance of continuous Chinese speech recognition.
Bibliographic reference. Wu, Hao / Wu, Xihong (2007): "Context dependent syllable acoustic model for continuous Chinese speech recognition", In INTERSPEECH-2007, 1713-1716.