ISCA Archive ISCSLP 2002

Partial change phone models for pronunciation variations in spontaneous Mandarin speech

Yi Liu, Pascale Fung

Modeling pronunciation variations is a critical part of spontaneous Mandarin speech recognition. Such variations include both complete changes and partial changes. A complete pronunciation change can usually be modeled by replacing the canonical phoneme with an alternative phone. Partial changes are variations within the phoneme, including those marked by diacritics, and cannot be modeled by conventional methods. In this paper, we propose using partial change phone models to represent such changes. The pre-trained acoustic model is reconstructed by sharing Gaussian mixtures between canonical phone models and partial change phone models at the state level, which improves the resolution of the acoustic model so that it can accommodate partial changes. The effectiveness of this approach is evaluated on the Hub4NE Mandarin Broadcast News Corpus. Syllable accuracy improved by 2.59% absolute over the baseline.
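The state-level sharing of Gaussian mixtures can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `Gaussian`, `StateGMM`, and `share_mixtures`, the one-dimensional features, and the `partial_mass` interpolation weight are all hypothetical. It only shows one plausible way to pool the components of a canonical phone state and a partial change phone state into a single, renormalized mixture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    """One-dimensional Gaussian component (hypothetical simplification)."""
    mean: float
    var: float


@dataclass
class StateGMM:
    """Gaussian mixture attached to a single HMM state."""
    weights: List[float]
    components: List[Gaussian]


def share_mixtures(canonical: StateGMM, partial: StateGMM,
                   partial_mass: float = 0.3) -> StateGMM:
    """Pool the components of two states into one mixture.

    partial_mass is an assumed interpolation weight: the fraction of
    probability mass given to the partial change components. The pooled
    weights sum to 1 whenever each input mixture's weights sum to 1.
    """
    if not 0.0 <= partial_mass <= 1.0:
        raise ValueError("partial_mass must lie in [0, 1]")
    weights = [(1.0 - partial_mass) * w for w in canonical.weights]
    weights += [partial_mass * w for w in partial.weights]
    components = list(canonical.components) + list(partial.components)
    return StateGMM(weights=weights, components=components)


# Toy example: a two-component canonical state and a one-component
# partial change state (values are made up).
canonical_state = StateGMM([0.5, 0.5], [Gaussian(0.0, 1.0), Gaussian(1.0, 1.0)])
partial_state = StateGMM([1.0], [Gaussian(3.0, 2.0)])
merged_state = share_mixtures(canonical_state, partial_state, partial_mass=0.2)
```

In this sketch the merged state carries more components than the canonical state alone, which is one way to read "improving the resolution" of the acoustic model. How the actual system selects states and sets the weights is not specified in the abstract.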


Cite as: Liu, Y., Fung, P. (2002) Partial change phone models for pronunciation variations in spontaneous Mandarin speech. Proc. International Symposium on Chinese Spoken Language Processing, paper 27

@inproceedings{liu02b_iscslp,
  author={Yi Liu and Pascale Fung},
  title={{Partial change phone models for pronunciation variations in spontaneous Mandarin speech}},
  year=2002,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={paper 27}
}