This paper proposes a paradigm in which common segmental pronunciation errors are modeled as pairwise confusions between two or more phonemes of the language being learnt. The method uses an ensemble of support vector machine classifiers with time-varying Mel-frequency cepstral features to distinguish between several pairs of phonemes. These classifiers are then applied to the phonemes uttered by second-language learners. Instead of giving feedback on every mispronounced phoneme, the method aims to give feedback on a particular student's typical mispronunciations over an entire session of several utterances. Two case studies that demonstrate how the paradigm is applied to provide suitable feedback to two students are also described.
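The session-level feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `session_feedback`, the thresholds `min_count` and `min_rate`, the phoneme labels, and the toy threshold classifiers are all assumptions made for illustration. In the paper, the pairwise classifiers are support vector machines operating on time-varying MFCC features; here they are replaced by simple stand-in functions so that the aggregation step can be shown on its own.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
# A pairwise classifier returns one of the two phoneme labels of its pair.
PairClassifier = Callable[[Features], str]


def session_feedback(
    tokens: List[Tuple[str, Features]],
    classifiers: Dict[Tuple[str, str], PairClassifier],
    min_count: int = 2,     # assumed threshold, not from the paper
    min_rate: float = 0.5,  # assumed threshold, not from the paper
) -> List[Tuple[str, str, int, int]]:
    """Aggregate pairwise confusions over a whole session.

    tokens: (target phoneme, feature vector) for each phoneme the learner uttered.
    classifiers: maps (target, competitor) to a classifier for that pair.
    Returns (target, competitor, confusions, attempts) for each confusion
    that is frequent enough to count as typical for this learner.
    """
    attempts: Counter = Counter()
    confusions: Counter = Counter()
    for target, feats in tokens:
        attempts[target] += 1
        # Ask every pairwise classifier that involves this target phoneme.
        for (tgt, competitor), clf in classifiers.items():
            if tgt == target and clf(feats) == competitor:
                confusions[(target, competitor)] += 1

    report = []
    for (target, competitor), n in confusions.items():
        if n >= min_count and n / attempts[target] >= min_rate:
            report.append((target, competitor, n, attempts[target]))
    # Most frequent confusion (by rate) first; ties broken alphabetically.
    report.sort(key=lambda r: (-r[2] / r[3], r[0], r[1]))
    return report


# Toy stand-ins for trained SVMs: a single feature with a fixed threshold.
demo_classifiers: Dict[Tuple[str, str], PairClassifier] = {
    ("i", "e"): lambda f: "e" if f[0] < 0.5 else "i",
    ("s", "z"): lambda f: "z" if f[0] < 0.5 else "s",
}
# Hypothetical session: three attempts at "i", two at "s".
demo_tokens: List[Tuple[str, Features]] = [
    ("i", [0.2]), ("i", [0.3]), ("i", [0.9]),
    ("s", [0.8]), ("s", [0.7]),
]

if __name__ == "__main__":
    for target, competitor, n, total in session_feedback(demo_tokens, demo_classifiers):
        print(f"/{target}/ classified as /{competitor}/ in {n} of {total} attempts")
```

Counting confusions per phoneme pair and reporting only those above a threshold is one simple way to turn per-phoneme decisions into feedback about a learner's typical errors; the abstract does not specify the aggregation rule actually used.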
Index Terms: Support Vector Machines, Time-Varying MFCC, CAPT
Cite as: Ananthakrishnan, G., Wik, P., Engwall, O., Abdou, S. (2011) Using an ensemble of classifiers for mispronunciation feedback. Proc. Speech and Language Technology in Education (SLaTE 2011), 49-52
@inproceedings{ananthakrishnan11_slate, author={Gopal Ananthakrishnan and Preben Wik and Olov Engwall and Sherif Abdou}, title={{Using an ensemble of classifiers for mispronunciation feedback}}, year=2011, booktitle={Proc. Speech and Language Technology in Education (SLaTE 2011)}, pages={49--52} }