Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures

Ju Lin, Yanlu Xie, Jinsong Zhang


Automatic evaluation of tone production plays an important role in Computer-Assisted Pronunciation Training (CAPT) systems for tonal languages. In this paper, we propose an automatic evaluation method for non-native Mandarin tones. The method uses multi-level confidence measures generated by a Deep Neural Network (DNN): Log Posterior Ratios (LPR), Average Frame-level Log Posteriors (AFLP) and Segment-level Log Posteriors (SLP). The LPR is calculated between the correct tone model and the competing tone models. The AFLP and LPR are obtained from frame-level scores, whereas the SLP is derived directly from segment-level scores. The multi-level confidence measures are modeled with a support vector machine (SVM) classifier. For comparison, three experiments were conducted with different feature sets: AFLP+LPR, SLP only, and AFLP+LPR+SLP. The system that used all of the multi-level confidence measures performed best, achieving an FRR of 5.63% and a DA of 82.45%, which demonstrates the effectiveness of the proposed method.
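
The abstract does not give the exact formulas for the confidence measures, so the following is only a minimal sketch of how such features might be assembled for one tone segment. It assumes that the AFLP is the mean log posterior of the target tone over the segment's frames, that each LPR is the target AFLP minus the AFLP of one competing tone, and that the SLP is the log of a single segment-level posterior. The function names, the four-tone inventory, the probability floor and the toy posterior values are all hypothetical and not taken from the paper.

```python
import math

EPS = 1e-10  # hypothetical floor to avoid log(0); not from the paper


def aflp(frame_posteriors, tone):
    """Average frame-level log posterior of `tone` over one segment (assumed definition)."""
    logs = [math.log(max(frame[tone], EPS)) for frame in frame_posteriors]
    return sum(logs) / len(logs)


def lpr(frame_posteriors, target, n_tones):
    """Log posterior ratios: target AFLP minus the AFLP of each competing tone (assumed definition)."""
    target_score = aflp(frame_posteriors, target)
    return [
        target_score - aflp(frame_posteriors, tone)
        for tone in range(n_tones)
        if tone != target
    ]


def slp(segment_posterior, target):
    """Segment-level log posterior of the target tone (assumed definition)."""
    return math.log(max(segment_posterior[target], EPS))


def feature_vector(frame_posteriors, segment_posterior, target):
    """Concatenate AFLP, LPR and SLP into one multi-level feature vector."""
    n_tones = len(segment_posterior)
    return (
        [aflp(frame_posteriors, target)]
        + lpr(frame_posteriors, target, n_tones)
        + [slp(segment_posterior, target)]
    )


# Toy example: two frames, four tones, target tone index 0 (all values invented).
frames = [[0.7, 0.1, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1]]
segment = [0.8, 0.1, 0.05, 0.05]
features = feature_vector(frames, segment, target=0)
print(features)
```

A vector of this kind, one per tone segment, is what would then be passed to a binary classifier such as an SVM to decide whether the tone is accepted or rejected. Under these assumptions, the three feature sets compared in the experiments correspond to different slices of the vector.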


DOI: 10.21437/Interspeech.2016-1162

Cite as

Lin, J., Xie, Y., Zhang, J. (2016) Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures. Proc. Interspeech 2016, 2666-2670.

Bibtex
@inproceedings{Lin+2016,
author={Ju Lin and Yanlu Xie and Jinsong Zhang},
title={Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1162},
url={http://dx.doi.org/10.21437/Interspeech.2016-1162},
pages={2666--2670}
}