EUROSPEECH 2003 - INTERSPEECH 2003
In this paper, we propose a statistical method for evaluating the pronunciation proficiency of English words spoken by Japanese speakers. We statistically analyze the utterances to find a combination of acoustic features that correlates highly with an English teacher's score. We found that the likelihood ratio of English phoneme acoustic models to phoneme acoustic models adapted to Japanese speakers was the best measure of pronunciation proficiency. A combination of the likelihood for American native models, the likelihood for English models adapted to Japanese speakers, the best likelihood for arbitrary sequences of acoustic models, the phoneme recognition rate, and the rate of speech is highly correlated with the English teacher's score. At the five-word-set level, we obtained correlation coefficients of 0.81 on open data for vocabulary and 0.69 on open data for speakers. Both coefficients were higher than the correlation between human raters' scores, 0.65.
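As a rough illustration of the kind of computation the abstract describes, the following sketch forms a log-domain likelihood ratio between native and Japanese-adapted models, combines it with one other feature by least-squares regression, and measures the Pearson correlation with a teacher's score on held-out items. It is only a sketch under stated assumptions: the function names, the per-frame normalization, the use of ordinary least squares, and all of the data (which is synthetic) are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_likelihood_ratio(loglik_native, loglik_adapted, n_frames):
    """Log of the native/adapted likelihood ratio, normalized per frame.

    Assumption: inputs are total log-likelihoods per utterance, so the
    ratio becomes a difference; frame normalization is our own choice.
    """
    native = np.asarray(loglik_native, dtype=float)
    adapted = np.asarray(loglik_adapted, dtype=float)
    return (native - adapted) / np.asarray(n_frames, dtype=float)

def fit_linear_combination(features, teacher_scores):
    """Least-squares weights (plus a bias term) mapping features to scores."""
    X = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(X, teacher_scores, rcond=None)
    return weights

def predict(features, weights):
    """Apply the fitted weights and bias to a feature matrix."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ weights

def pearson(a, b):
    """Pearson correlation coefficient between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-in data (NOT from the paper): 40 utterance sets.
rng = np.random.default_rng(0)
n = 40
n_frames = rng.integers(50, 150, size=n)
loglik_native = rng.normal(-60.0, 5.0, size=n) * n_frames
loglik_adapted = rng.normal(-60.0, 5.0, size=n) * n_frames
llr = log_likelihood_ratio(loglik_native, loglik_adapted, n_frames)
speech_rate = rng.normal(10.0, 2.0, size=n)  # hypothetical phonemes per second
features = np.column_stack([llr, speech_rate])
teacher = 3.0 + 0.2 * llr + rng.normal(0.0, 0.5, size=n)  # synthetic scores

# Fit on the first 30 items, evaluate on the remaining 10 ("open" data).
train, test = slice(0, 30), slice(30, None)
weights = fit_linear_combination(features[train], teacher[train])
r_open = pearson(predict(features[test], weights), teacher[test])
print(f"open-data correlation on synthetic data: {r_open:.2f}")
```

Splitting the data by item, as above, loosely mirrors an open-data evaluation; holding out whole speakers or whole vocabulary items instead would correspond to the two open conditions mentioned in the abstract.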
Bibliographic reference. Nakagawa, Seiichi / Mori, Kazumasa / Nakamura, Naoki (2003): "A statistical method of evaluating pronunciation proficiency for English words spoken by Japanese", In EUROSPEECH-2003, 3193-3196.