ISCA International Workshop on Speech and Language Technology in Education (SLaTE 2011)

Venice, Italy
August 24-26, 2011

Statistical Machine Translation Framework for Modeling Phonological Errors in Computer Assisted Pronunciation Training System

Theban Stanley, Kadri Hacioglu, Bryan Pellom

Rosetta Stone Labs, Boulder, Colorado, USA

Computer Assisted Pronunciation Training (CAPT) is becoming increasingly popular among language learners. Most effective CAPT systems take advantage of the learner’s L1 and tailor exercises and feedback to the language transfer effects. This paper presents a statistical machine translation (MT) based approach to modeling the salient phonological errors present in an L1 population. The output of the MT system is coupled with a speech recognition system to detect non-native phonological errors. On a corpus of Korean learners of English, the MT approach shows a 32.9% relative improvement in phone error detection and a 49% relative improvement in phone error identification compared to edit-distance-based modeling techniques. Similar performance improvements were observed on a corpus of Japanese learners of English.
Index Terms. phonological error modeling, machine translation, speech recognition
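
The abstract compares the MT approach against edit distance based modeling but does not spell that baseline out. The following is only a minimal sketch of what such a technique might look like, not the authors' implementation: it aligns a canonical phone sequence with a learner's phone sequence by Levenshtein distance and counts the mismatched pairs as candidate error patterns. The function names `align_phones` and `error_counts`, the unit edit costs, and the ARPAbet-style phone strings are all assumptions made for illustration.

```python
from collections import Counter


def align_phones(canonical, observed):
    """Align two phone sequences with unit-cost Levenshtein distance.

    Returns (canonical_phone, observed_phone) pairs; None marks a
    deleted canonical phone or an inserted observed phone.
    """
    n, m = len(canonical), len(observed)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != observed[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j]
                == cost[i - 1][j - 1] + (canonical[i - 1] != observed[j - 1])):
            pairs.append((canonical[i - 1], observed[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((canonical[i - 1], None))  # deletion
            i -= 1
        else:
            pairs.append((None, observed[j - 1]))  # insertion
            j -= 1
    pairs.reverse()
    return pairs


def error_counts(corpus):
    """Count mismatched (canonical, observed) pairs over a corpus."""
    counts = Counter()
    for canonical, observed in corpus:
        for c, o in align_phones(canonical, observed):
            if c != o:
                counts[(c, o)] += 1
    return counts


# Hypothetical toy data, not taken from the paper's corpora.
toy_corpus = [
    (["R", "AY", "T"], ["L", "AY", "T"]),
    (["R", "EH", "D"], ["L", "EH", "D"]),
    (["S", "T", "R"], ["S", "UH", "T", "R"]),
]
patterns = error_counts(toy_corpus)
```

A limitation of this kind of alignment is that each mismatch is counted in isolation, without the surrounding phone context. An MT formulation, which treats canonical and learner phone sequences as a source and target language, is one way to capture that context.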

Bibliographic reference. Stanley, Theban / Hacioglu, Kadri / Pellom, Bryan (2011): "Statistical machine translation framework for modeling phonological errors in computer assisted pronunciation training system", In SLaTE-2011, 125-128.