In this paper, we present a tone recognition framework for continuous Mandarin speech. To model the variations in F0 patterns caused by co-articulation and phonetic effects, we extract a set of discriminative features: 1) outlined features from the F0 contours of the target syllable and its neighboring syllables are combined; 2) contextual tone information is used within an iterative process; 3) phonetic information from the target and neighboring syllables is incorporated. These features are fed to a decision tree for tone classification, which is applied after an HMM-based toneless decoder. In 5-tone recognition experiments, the framework achieves a relative error rate reduction of more than 40% over the baseline of local outlined features. Moreover, the proposed method clearly outperforms an HMM-based tone model in speaker-independent evaluation.
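The reported improvement is a relative, not an absolute, error rate reduction. The sketch below shows how such a figure is conventionally computed. The function name `relative_error_reduction` and the error rates used are illustrative assumptions, not values or code from the paper.

```python
def relative_error_reduction(baseline_error: float, proposed_error: float) -> float:
    """Return the fraction of the baseline's errors removed by the proposed system.

    Both arguments are error rates expressed as fractions in [0, 1].
    """
    if not 0.0 < baseline_error <= 1.0:
        raise ValueError("baseline_error must be in (0, 1]")
    if not 0.0 <= proposed_error <= 1.0:
        raise ValueError("proposed_error must be in [0, 1]")
    return (baseline_error - proposed_error) / baseline_error


# Hypothetical error rates chosen only to illustrate the arithmetic;
# they are NOT results reported in the paper.
baseline = 0.30   # assumed tone error rate with local outlined features only
proposed = 0.18   # assumed tone error rate with the full feature set

reduction = relative_error_reduction(baseline, proposed)
print(f"relative error rate reduction: {reduction:.0%}")
```

With these assumed numbers, an absolute drop of 12 percentage points corresponds to a 40% relative reduction, because the drop is measured against the baseline error rate rather than against 100%.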
Cite as: He, L., Hao, J. (2006) A tone recognition framework for continuous Mandarin speech. Proc. Interspeech 2006, paper 1348-Wed1BuP.7, doi: 10.21437/Interspeech.2006-441
@inproceedings{he06_interspeech,
  author={Lei He and Jie Hao},
  title={{A tone recognition framework for continuous Mandarin speech}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1348-Wed1BuP.7},
  doi={10.21437/Interspeech.2006-441}
}