In this article we present a novel method for automatic pronunciation error detection in children's speech. A phone graph is generated from the audio segment and, if necessary, augmented with alignments of phonetic transcriptions of the word to be scored. Phone-level features are extracted from this graph using conventional HMM/GMM acoustic scores and Support Vector Machine (SVM) classifiers acting as probabilistic estimators. Finally, an SVM combines the phone-level features extracted for each word to produce a word-based pronunciation score. Experimental results show that the proposed method and features can be used effectively for pronunciation scoring. In particular, the detection of mispronunciations improves by more than 22% with respect to the baseline.
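The final combination step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a variable number of phones is mapped to a fixed-length input, so the mean/min pooling in `word_features`, the hand-set `WEIGHTS` and `BIAS` standing in for a trained linear SVM, the logistic squashing of the margin, and all numeric values below are assumptions made only for illustration.

```python
import math
from statistics import mean

def word_features(phone_scores):
    """Pool per-phone scores into a fixed-length word-level vector.

    phone_scores: list of (acoustic_score, svm_posterior) pairs, one per
    phone. The pooling scheme (mean and min) is an assumption, not taken
    from the paper.
    """
    acoustic = [a for a, _ in phone_scores]
    posterior = [p for _, p in phone_scores]
    return [mean(acoustic), min(acoustic), mean(posterior), min(posterior)]

# Hypothetical parameters standing in for a trained linear SVM.
WEIGHTS = [0.5, 0.5, 2.0, 2.0]
BIAS = -2.0

def word_score(phone_scores):
    """Map the linear margin to (0, 1) with a logistic function."""
    x = word_features(phone_scores)
    margin = sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-margin))

# Made-up phone-level scores: the second word has one poorly matched phone.
well_pronounced = [(-0.2, 0.9), (-0.1, 0.95), (-0.3, 0.85)]
mispronounced = [(-0.2, 0.9), (-2.5, 0.1), (-0.3, 0.8)]

print(word_score(well_pronounced))
print(word_score(mispronounced))
```

Under these assumed parameters, a single badly scored phone lowers both the pooled mean and the pooled minimum, which pulls the word-level score down. A real system would learn the combination from labeled data rather than fix the weights by hand.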
Bibliographic reference. Bolanos, Daniel / Ward, Wayne / Wise, Barbara / Vuuren, Sarel van (2008): "Pronunciation error detection techniques for children's speech", In INTERSPEECH-2008, 1725-1728.