The aim of this paper is to investigate to what extent non-native speech degrades language identification (LID) performance, and to improve that performance using acoustic adaptation. Our reference LID system is based on a phonotactic approach: it uses language-independent acoustic models and language-specific phone-based bigram language models. Experiments are conducted on the SQALE test database, which contains recordings from native English, French and German speakers, and on the MIST database, which contains non-native speech in the same languages uttered by Dutch speakers. Using 5 seconds of telephone-quality speech, the language identification error rate is 19% for native speech and 31% for non-native speech, a relative error rate increase of about 60%. Finally, we propose to improve non-native language identification by adapting the acoustic models to the non-native speech.
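The phonotactic decision step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the phone sequence has already been decoded by the language-independent acoustic models, and the add-one smoothing, the toy phone strings, and the function names (`train_bigram`, `score`, `identify`) are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab, alpha=1.0):
    """Return a function giving add-alpha smoothed bigram log-probabilities.

    Smoothing scheme is an assumption; the abstract does not specify one.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the utterance start
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(vocab)

    def logprob(prev, cur):
        num = pair_counts[(prev, cur)] + alpha
        den = context_counts[prev] + alpha * vocab_size
        return math.log(num / den)

    return logprob


def score(logprob, seq):
    """Log-likelihood of a phone sequence under one bigram model."""
    padded = ["<s>"] + list(seq)
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))


def identify(models, seq):
    """Pick the language whose bigram model scores the sequence highest."""
    return max(models, key=lambda lang: score(models[lang], seq))


# Toy phone transcriptions (invented for illustration, not SQALE or MIST data).
training = {
    "english": [["dh", "ax", "k", "ae", "t"], ["dh", "ax", "d", "ao", "g"]],
    "french": [["l", "ax", "sh", "a"], ["l", "ax", "sh", "i", "e"]],
}
vocab = sorted({p for seqs in training.values() for s in seqs for p in s})
models = {lang: train_bigram(seqs, vocab) for lang, seqs in training.items()}

result = identify(models, ["dh", "ax", "k", "ao", "g"])
print(result)
```

In this sketch every language model scores the same decoded phone string, so only the phonotactic statistics differ between languages; a real system would score the output of an acoustic phone recognizer.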
Cite as: Wanneroy, R., Bilinski, E., Barras, C., Adda-Decker, M., Geoffrois, E. (1999) Acoustic-phonetic modeling of non-native speech for language identification. Proc. Multi-Lingual Interoperability in Speech Technology, 8-12.
@inproceedings{wanneroy99_mist,
  author={R. Wanneroy and E. Bilinski and C. Barras and Martine Adda-Decker and Edouard Geoffrois},
  title={{Acoustic-phonetic modeling of non-native speech for language identification}},
  year=1999,
  booktitle={Proc. Multi-Lingual Interoperability in Speech Technology},
  pages={8--12}
}