Pronunciation Error Detection for New Language Learners

Sean Robertson, Cosmin Munteanu, Gerald Penn

Existing pronunciation error detection research assumes that second language learners’ speech is advanced enough that its segments are generally well articulated. However, learners just beginning their studies, especially when those studies are organized according to western, dialogue-driven pedagogies, are unlikely to abide by those assumptions. This paper presents an evaluation of pronunciation error detectors on the utterances of second language learners just beginning their studies. A corpus of nonnative speech data is collected through an experimental application teaching beginner French. Word-level binary labels are acquired through successive pairwise comparisons made by language experts with years of experience teaching. Six error detectors are trained to classify these data: a classifier inspired by phonetic distance algorithms; the Goodness of Pronunciation classifier [1]; and four GMM-based discriminative classifiers modelled after [2]. Three partitioning strategies for 4-fold cross-validation are tested: one based on corpus distribution, another leaving speakers out, and another leaving annotators out. The best error detector, a log-likelihood ratio of native versus nonnative GMMs, achieved detector-annotator agreement of up to κ = .41, near the expected between-annotator agreement.
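The detector-annotator agreement reported above is a κ statistic. As a minimal sketch, assuming κ here is Cohen's kappa computed over word-level binary labels (the abstract does not state which variant was used), agreement between one detector and one annotator could be computed as follows. The function name `cohen_kappa` and the example labels are hypothetical and are not taken from the paper's corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    rate and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical word-level labels: 1 = mispronounced, 0 = acceptable.
detector = [1, 1, 0, 0, 1, 0, 1, 0]
annotator = [1, 1, 0, 0, 0, 0, 1, 1]
kappa = cohen_kappa(detector, annotator)
```

In this illustrative case the two label sequences agree on 6 of 8 words and chance agreement is 0.5, so κ = 0.5. Unlike raw accuracy, κ discounts the agreement that two raters would reach by chance, which is why it can be compared directly against between-annotator agreement.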

DOI: 10.21437/Interspeech.2016-539

Cite as

Robertson, S., Munteanu, C., Penn, G. (2016) Pronunciation Error Detection for New Language Learners. Proc. Interspeech 2016, 2691-2695.

author={Sean Robertson and Cosmin Munteanu and Gerald Penn},
title={Pronunciation Error Detection for New Language Learners},
booktitle={Interspeech 2016},