International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

Partial Change Phone Models for Pronunciation Variations in Spontaneous Mandarin Speech

Yi Liu, Pascale Fung

Human Language Technology Center, Department of Electrical and Electronic Engineering, University of Science and Technology, Hong Kong

Modeling pronunciation variation is a critical part of spontaneous Mandarin speech recognition. Such variation includes both complete changes and partial changes. A complete change can usually be modeled by replacing the canonical phoneme with an alternative phone. A partial change is a variation within the phoneme, such as one marked by a diacritic, and cannot be modeled by conventional methods. In this paper, we propose partial change phone models to represent such changes. The pre-trained acoustic model is reconstructed by sharing Gaussian mixtures between canonical phone models and partial change phone models at the state level, which increases the resolution of the acoustic model so that it can accommodate partial changes. We evaluate the approach on the Hub4NE Mandarin Broadcast News Corpus, where syllable accuracy improves by 2.59% absolute over the baseline.
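As a rough illustration of state-level Gaussian mixture sharing, the sketch below pools the components of a canonical phone state and a partial change phone state into one mixture. It is not the authors' reconstruction procedure: the one-dimensional Gaussians, the names `StateGMM` and `share_mixtures`, and the interpolation weight `partial_prob` are all assumptions introduced here for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    mean: float
    var: float

    def pdf(self, x: float) -> float:
        # Density of a one-dimensional Gaussian at x.
        norm = math.sqrt(2.0 * math.pi * self.var)
        return math.exp(-0.5 * (x - self.mean) ** 2 / self.var) / norm


@dataclass
class StateGMM:
    weights: List[float]
    components: List[Gaussian]

    def likelihood(self, x: float) -> float:
        # Weighted sum of component densities for one HMM state.
        return sum(w * g.pdf(x) for w, g in zip(self.weights, self.components))


def share_mixtures(canonical: StateGMM, partial: StateGMM,
                   partial_prob: float) -> StateGMM:
    # Hypothetical sharing rule: pool both component sets and rescale
    # the weights so that the merged mixture still sums to one.
    weights = [(1.0 - partial_prob) * w for w in canonical.weights]
    weights += [partial_prob * w for w in partial.weights]
    return StateGMM(weights, canonical.components + partial.components)


# Toy parameters, chosen only to exercise the code.
canonical = StateGMM([0.6, 0.4], [Gaussian(0.0, 1.0), Gaussian(1.0, 0.5)])
partial = StateGMM([1.0], [Gaussian(0.5, 0.8)])
merged = share_mixtures(canonical, partial, partial_prob=0.2)
```

Under this assumed rule, the merged state scores an observation as an interpolation of the two original state likelihoods, so the canonical state gains extra components that cover the partially changed realizations.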

