Transparent Pronunciation Scoring Using Articulatorily Weighted Phoneme Edit Distance

Reima Karhila, Anna-Riikka Smolander, Sari Ylinen, Mikko Kurimo


To study the effects of gamification in foreign language learning for children, in the “Say It Again, Kid!” project we developed a feedback paradigm that can drive gameplay in pronunciation learning games. We describe our scoring system, which is based on the difference between a reference phone sequence and the output of a multilingual CTC phoneme recogniser. We present a white-box scoring model: a mapped, weighted Levenshtein edit distance between the reference and the recognised phone sequence, with error weights for articulatory differences computed from a training set of scored utterances. The system can produce a human-readable list showing how much each detected mispronunciation contributes to the utterance score. We compare our scoring method to established black-box methods.
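
As a rough illustration of the idea (not the system described in the paper), a weighted Levenshtein distance over phone sequences can be sketched as follows. The phone inventory, the articulatory feature table, and all weights here are hypothetical and hand-picked; in the paper the error weights are computed from a training set of scored utterances, and the mapping step is not modelled in this sketch.

```python
import math
from typing import List, Sequence, Tuple

# (operation, reference phone, recognised phone, cost contribution)
Edit = Tuple[str, str, str, float]

# Hypothetical articulatory features per phone: (voicing, place, manner).
FEATURES = {
    "p": ("voiceless", "bilabial", "stop"),
    "b": ("voiced", "bilabial", "stop"),
    "t": ("voiceless", "alveolar", "stop"),
    "d": ("voiced", "alveolar", "stop"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
}
# Hypothetical hand-picked weights for a mismatch in each feature.
FEATURE_WEIGHTS = (0.2, 0.3, 0.5)
# Hypothetical flat cost for inserting or deleting a phone.
INDEL_COST = 1.0


def substitution_cost(ref: str, hyp: str) -> float:
    """Sum the weights of the articulatory features that differ."""
    if ref == hyp:
        return 0.0
    if ref not in FEATURES or hyp not in FEATURES:
        return 1.0  # no feature information: fall back to a flat cost
    pairs = zip(FEATURE_WEIGHTS, FEATURES[ref], FEATURES[hyp])
    return sum(w for w, a, b in pairs if a != b)


def weighted_edit_distance(
    ref: Sequence[str], hyp: Sequence[str]
) -> Tuple[float, List[Edit]]:
    """Return the total distance and the list of edits that produce it."""
    n, m = len(ref), len(hyp)
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + INDEL_COST
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + INDEL_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitution_cost(ref[i - 1], hyp[j - 1]),
                dist[i - 1][j] + INDEL_COST,  # phone missing from the output
                dist[i][j - 1] + INDEL_COST,  # extra phone in the output
            )

    # Backtrace one optimal path to recover each edit's contribution.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = substitution_cost(ref[i - 1], hyp[j - 1])
            if math.isclose(dist[i][j], dist[i - 1][j - 1] + sub):
                if sub > 0:
                    edits.append(("substitute", ref[i - 1], hyp[j - 1], sub))
                i, j = i - 1, j - 1
                continue
        if i > 0 and math.isclose(dist[i][j], dist[i - 1][j] + INDEL_COST):
            edits.append(("delete", ref[i - 1], "", INDEL_COST))
            i -= 1
        else:
            edits.append(("insert", "", hyp[j - 1], INDEL_COST))
            j -= 1
    edits.reverse()
    return dist[n][m], edits
```

With these assumed weights, comparing the reference `["b", "a", "d"]` against a recognised `["p", "a", "t"]` yields two voicing substitutions (b to p, d to t), each contributing 0.2 to the total. The returned edit list is what makes such a score inspectable: every entry names one detected difference and its share of the distance.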


DOI: 10.21437/Interspeech.2019-1785

Cite as: Karhila, R., Smolander, A., Ylinen, S., Kurimo, M. (2019) Transparent Pronunciation Scoring Using Articulatorily Weighted Phoneme Edit Distance. Proc. Interspeech 2019, 1866-1870, DOI: 10.21437/Interspeech.2019-1785.


@inproceedings{Karhila2019,
  author={Reima Karhila and Anna-Riikka Smolander and Sari Ylinen and Mikko Kurimo},
  title={{Transparent Pronunciation Scoring Using Articulatorily Weighted Phoneme Edit Distance}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1866--1870},
  doi={10.21437/Interspeech.2019-1785},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1785}
}