ISCA International Workshop on Speech and Language Technology in Education (SLaTE 2011)

Venice, Italy
August 24-26, 2011

Using an Ensemble of Classifiers for Mispronunciation Feedback

Gopal Ananthakrishnan (1), Preben Wik (1), Olov Engwall (1), Sherif Abdou (2)

(1) Centre for Speech Technology, KTH (Royal Institute of Technology), Stockholm, Sweden
(2) Faculty of Computers & Information, Cairo University, Egypt

This paper proposes a paradigm in which commonly made segmental pronunciation errors are modeled as pair-wise confusions between two or more phonemes of the language being learnt. The method uses an ensemble of support vector machine classifiers with time-varying Mel-frequency cepstral features to distinguish between several pairs of phonemes. These classifiers are then applied to classify the phonemes uttered by second language learners. Instead of providing feedback at every mispronounced phoneme, the method attempts to provide feedback about the typical mispronunciations of a given student over an entire session of several utterances. Two case studies that demonstrate how the paradigm is applied to provide suitable feedback to two students are also described.
Index Terms. Support Vector Machines, Time-Varying MFCC, CAPT
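
The session-level feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pair-wise classifiers have already produced one decision per segment, and it only shows how those decisions might be pooled over a session so that feedback targets recurring confusions rather than every single error. The function name `session_feedback`, the thresholds `min_count` and `min_rate`, and the example phoneme labels are all hypothetical.

```python
from collections import Counter


def session_feedback(decisions, min_count=3, min_rate=0.5):
    """Pool pair-wise classifier decisions over a whole session.

    decisions: iterable of (target, heard) phoneme labels, one per
    classified segment. `heard` is the label a pair-wise classifier
    chose for that segment (assumed to be computed elsewhere).
    Returns (target, heard, rate) tuples for confusions that occur at
    least `min_count` times and in at least `min_rate` of the target's
    occurrences, sorted by decreasing rate.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    totals = Counter()      # how often each target phoneme was attempted
    confusions = Counter()  # how often target was classified as another phoneme
    for target, heard in decisions:
        totals[target] += 1
        if heard != target:
            confusions[(target, heard)] += 1

    feedback = []
    for (target, heard), n in confusions.items():
        rate = n / totals[target]
        if n >= min_count and rate >= min_rate:
            feedback.append((target, heard, rate))
    feedback.sort(key=lambda item: item[2], reverse=True)
    return feedback


# Hypothetical session: a frequent i:/I confusion and a one-off s/S slip.
session = (
    [("i:", "I")] * 4
    + [("i:", "i:")] * 1
    + [("s", "s")] * 5
    + [("s", "S")] * 1
)
result = session_feedback(session)
```

With these made-up decisions, only the recurring confusion is reported; the single isolated slip falls below `min_count` and is ignored, which mirrors the idea of giving feedback on typical rather than occasional mispronunciations.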

Bibliographic reference. Ananthakrishnan, Gopal / Wik, Preben / Engwall, Olov / Abdou, Sherif (2011): "Using an ensemble of classifiers for mispronunciation feedback", in SLaTE-2011, 49-52.