SLaTE 2015 - Workshop on Speech and Language Technology in Education
We extend the Goodness of Pronunciation (GOP) algorithm from the conventional GMM-HMM to DNN-HMM and further optimize the GOP measure for the accented speech of L2 learners. We evaluate the proposed approach on phone-level mispronunciation detection and diagnosis using an L2 English learners' corpus. Experimental results show that the Equal Error Rate (EER) improves from 32.9% to 27.0% when GOP is extended from GMM-HMM to DNN-HMM, and improves by a further 1.5% absolute, to 25.5%, with our optimized GOP measure. For phone mispronunciation diagnosis, the optimized DNN-based GOP measure reduces the top-1 error rate from 61.0% to 51.4% and the top-5 error rate from 8.4% to 5.2%, compared with the original DNN-based measure. On a continuously read L2 Mandarin learners' corpus, our approaches achieve similar improvements.
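For readers unfamiliar with the detection metric quoted above, the following is a minimal Python sketch of how an Equal Error Rate can be computed from per-phone scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely mispronounced", and the toy data are all assumptions made for illustration. The EER is taken at the threshold where the false-alarm rate and the miss rate are closest.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a decision threshold over the scores.

    Assumed conventions (hypothetical, not taken from the paper):
      scores: higher value = phone is more likely mispronounced
      labels: 1 = mispronounced, 0 = correctly pronounced
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")

    best_gap, eer = None, None
    for threshold in sorted(set(scores)):
        # A phone is flagged as mispronounced when its score >= threshold.
        false_alarms = sum(1 for s, y in zip(scores, labels)
                           if s >= threshold and y == 0)
        misses = sum(1 for s, y in zip(scores, labels)
                     if s < threshold and y == 1)
        false_alarm_rate = false_alarms / negatives
        miss_rate = misses / positives
        gap = abs(false_alarm_rate - miss_rate)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (false_alarm_rate + miss_rate) / 2
    return eer


# Toy data, invented purely for illustration.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(equal_error_rate(toy_scores, toy_labels))
```

A lower EER means the scores separate mispronounced phones from correct ones more cleanly, which is the sense in which the reported drop from 32.9% to 25.5% is an improvement.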
Bibliographic reference. Hu, Wenping / Qian, Yao / Soong, Frank K. (2015): "An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners speech", In SLaTE-2015, 71-76.