This paper evaluates and compares different approaches to collecting judgments about the pronunciation accuracy of non-native speech. We compare the common approach, in which expert linguists provide a detailed phonetic transcription of non-native English speech, with word-level judgments collected from multiple naïve listeners via a crowd-sourcing platform. In both cases we found low agreement between annotators on which words should be marked as errors. We then compare the error detection task to a simpler transcription task in which annotators were asked to transcribe the same fragments using standard English spelling. We argue that the transcription task is a simpler and more practical way of collecting annotations, and that it also yields more valid data for training an automatic scoring system.
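Inter-annotator agreement on binary word-level judgments (error / no error) of the kind discussed here is commonly summarized with Cohen's kappa, which corrects raw agreement for chance. The paper does not specify its agreement metric, so the following is only an illustrative sketch with made-up labels, not the authors' actual computation:

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' binary labels (1 = word marked as error)."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of words where the two annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's marginal label rates.
    p1_a = sum(labels_a) / n
    p1_b = sum(labels_b) / n
    p_e = p1_a * p1_b + (1 - p1_a) * (1 - p1_b)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

# Hypothetical example: two annotators' error marks over five words.
ann1 = [1, 0, 0, 1, 0]
ann2 = [1, 0, 1, 0, 0]
print(round(cohen_kappa(ann1, ann2), 3))  # low kappa despite 60% raw agreement
```

A kappa near zero, as in this toy example, indicates agreement barely above chance, the kind of outcome described as "low agreement" in the abstract.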
Bibliographic reference. Loukina, Anastassia / Lopez, Melissa / Evanini, Keelan / Suendermann-Oeft, David / Zechner, Klaus (2015): "Expert and crowdsourced annotation of pronunciation errors for automatic scoring systems", in INTERSPEECH 2015, pp. 2809-2813.