9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Mispronunciation Detection for Mandarin Chinese

Chao Huang, Feng Zhang, Frank K. Soong, Min Chu

Microsoft Research Asia, China

In this paper, we propose several reliable weighting factors, based on the speaker's proficiency level, that can be used to normalize the scaled log-posterior probability (SLPP) and further improve mispronunciation detection at the syllable level for Mandarin Chinese. Experiments on a database of 8000 syllables, pronounced by 40 speakers with varied pronunciation proficiency, show that these normalization schemes are effective: they reduce FAR from 44.4% to 35.1% on average and greatly improve automatic mispronunciation detection (AMD) performance. In addition, we investigate and analyze the underlying behavior of these normalization factors. Some modifications, extensions and possible applications of the factors in real usage cases are also discussed.

Bibliographic reference.  Huang, Chao / Zhang, Feng / Soong, Frank K. / Chu, Min (2008): "Mispronunciation detection for Mandarin Chinese", In INTERSPEECH-2008, 2655-2658.