Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Automatic Detection of Phone-Level Mispronunciation for Language Learning

Horacio Franco, Leonardo Neumeyer, María Ramos, Harry Bratt

SRI International, Speech Technology and Research Laboratory, Menlo Park, CA, USA

We are interested in automatically detecting specific phone segments that have been mispronounced by a nonnative student of a foreign language. The phone-level information allows a language instruction system to provide the student with feedback about specific pronunciation mistakes. Two approaches were evaluated. In the first approach, log-posterior probability-based scores [1] are computed for each phone segment; these probabilities are based on acoustic models of native speech. The second approach uses a phonetically labeled nonnative speech database to train two different acoustic models for each phone: one model is trained on the acceptable, or correct native-like, pronunciations, while the other is trained on the incorrect, strongly nonnative pronunciations. For each phone segment, a log-likelihood ratio score is computed using the incorrect and correct pronunciation models. Either type of score is compared with a phone-dependent threshold to detect a mispronunciation. Performance of both approaches was evaluated on a phonetically transcribed database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers.
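
The two segment-level scores and the threshold test described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation. It assumes that per-frame posteriors and per-frame log-likelihoods for a phone segment are already available from some acoustic model, that each score is averaged over the frames of the segment, and that a low log-posterior score or a high log-likelihood ratio indicates a mispronunciation. All function names, phone labels, and threshold values are hypothetical.

```python
import math

def log_posterior_score(frame_posteriors):
    """Average log posterior of the target phone over a segment's frames.

    Assumption: frame_posteriors holds P(phone | frame) from native-speech
    acoustic models, one value per frame, each in (0, 1].
    """
    return sum(math.log(p) for p in frame_posteriors) / len(frame_posteriors)

def llr_score(loglik_incorrect, loglik_correct):
    """Frame-averaged log-likelihood ratio for one phone segment.

    Assumption: both lists hold per-frame log-likelihoods of the same
    frames, under the 'incorrect' and 'correct' pronunciation models.
    """
    diffs = [m - c for m, c in zip(loglik_incorrect, loglik_correct)]
    return sum(diffs) / len(diffs)

# Hypothetical phone-dependent thresholds (illustrative values only).
POSTERIOR_THRESHOLDS = {"a": -1.0, "r": -1.5}
LLR_THRESHOLDS = {"a": 0.0, "r": 0.2}

def mispronounced_by_posterior(phone, score):
    # Assumed direction: a low posterior score means a poor match to native speech.
    return score < POSTERIOR_THRESHOLDS[phone]

def mispronounced_by_llr(phone, score):
    # Assumed direction: a high ratio favors the 'incorrect' model.
    return score > LLR_THRESHOLDS[phone]
```

In such a setup, the thresholds would be tuned separately for each phone, since score distributions can differ considerably from one phone to another.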


Bibliographic reference.  Franco, Horacio / Neumeyer, Leonardo / Ramos, María / Bratt, Harry (1999): "Automatic detection of phone-level mispronunciation for language learning", In EUROSPEECH'99, 851-854.