Sixth European Conference on Speech Communication and Technology
We are interested in automatically detecting specific phone segments that have been mispronounced by a nonnative student of a foreign language. This phone-level information allows a language instruction system to give the student feedback about specific pronunciation mistakes. Two approaches were evaluated. In the first approach, log-posterior probability-based scores are computed for each phone segment; these probabilities are based on acoustic models of native speech. The second approach uses a phonetically labeled nonnative speech database to train two different acoustic models for each phone: one model is trained on the acceptable, or correct native-like, pronunciations, while the other is trained on the incorrect, strongly nonnative pronunciations. For each phone segment, a log-likelihood ratio score is computed using the incorrect and correct pronunciation models. Either type of score is compared with a phone-dependent threshold to detect a mispronunciation. The performance of both approaches was evaluated on a phonetically transcribed database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers.
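The two scores described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the per-frame averaging over the segment, the uniform phone priors used by default, and the direction of each threshold comparison are all assumptions that the abstract does not specify.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_posterior_score(frame_loglikes, target, log_priors=None):
    """Average per-frame log posterior of the target phone over a segment.

    frame_loglikes: one dict per frame mapping phone -> log p(x_t | phone),
    taken from native-speech acoustic models.
    log_priors: optional dict phone -> log P(phone); uniform if omitted
    (assumption: a uniform prior cancels in the posterior).
    """
    total = 0.0
    for frame in frame_loglikes:
        joint = {
            q: ll + (log_priors[q] if log_priors else 0.0)
            for q, ll in frame.items()
        }
        total += joint[target] - log_sum_exp(list(joint.values()))
    # Assumption: normalize by segment duration in frames.
    return total / len(frame_loglikes)


def llr_score(loglikes_incorrect, loglikes_correct):
    """Average per-frame log-likelihood ratio, incorrect model over correct model."""
    diffs = [a - b for a, b in zip(loglikes_incorrect, loglikes_correct)]
    return sum(diffs) / len(diffs)


def mispronounced_by_posterior(score, threshold):
    # Assumption: a low posterior under native models signals a mispronunciation.
    return score < threshold


def mispronounced_by_llr(score, threshold):
    # Assumption: a high ratio favoring the "incorrect" model signals a mispronunciation.
    return score > threshold
```

In both cases `threshold` would be looked up per phone, since the abstract states that the thresholds are phone dependent.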
Bibliographic reference. Franco, Horacio / Neumeyer, Leonardo / Ramos, María / Bratt, Harry (1999): "Automatic detection of phone-level mispronunciation for language learning", In EUROSPEECH'99, 851-854.