We present iCALL, a speech corpus designed to evaluate the Mandarin Chinese pronunciation patterns of non-native speakers of European descent, developed at the Institute for Infocomm Research (I2R) in Singapore. To the best of our knowledge, iCALL is larger than any non-native corpus reported to date in terms of number of utterances, duration, and number of speakers: it consists of 90,841 utterances from 305 speakers, with a total duration of 142 hours. The speakers are gender-balanced, come from diverse native-language backgrounds, and represent a realistic sample of the age range of adult Mandarin learners. The read utterances are phonetically balanced and vary in length (words, phrases, and sentences). The spoken utterances are phonetically transcribed and perceptually rated for fluency by trained native speakers of Mandarin. In this work, we share our experience in corpus design, data collection, and human annotation, and we analyze phonetic and tonal error patterns, in particular their relationship with speaker demographics and utterance length. Potential applications of the iCALL corpus include computer-assisted pronunciation training (CAPT), lexical tone recognition, automatic fluency assessment, accent recognition, and accented Mandarin speech recognition.
Bibliographic reference. Chen, Nancy F. / Tong, Rong / Wee, Darren / Lee, Peixuan / Ma, Bin / Li, Haizhou (2015): "iCALL corpus: Mandarin Chinese spoken by non-native speakers of European descent", In INTERSPEECH-2015, 324-328.