Native Language Detection Using the I-Vector Framework

Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro L. Koerich


Native-language identification is the task of determining a speaker’s native language based only on their speech in a second language. In this paper, we propose using the well-known i-vector representation of the speech signal to detect the native language of an English speaker. The i-vector representation has shown excellent performance on the closely related task of distinguishing between different languages. We evaluated different ways of extracting i-vectors in order to adapt them to the specifics of the native language detection task. Experimental results on the test set of the 2016 ComParE Native Language Sub-Challenge show that the proposed system, based on a conventional i-vector extractor, outperforms the baseline system with a 42% relative improvement.


DOI: 10.21437/Interspeech.2016-1473

Cite as

Senoussaoui, M., Cardinal, P., Dehak, N., Koerich, A.L. (2016) Native Language Detection Using the I-Vector Framework. Proc. Interspeech 2016, 2398-2402.

BibTeX
@inproceedings{Senoussaoui+2016,
author={Mohammed Senoussaoui and Patrick Cardinal and Najim Dehak and Alessandro L. Koerich},
title={Native Language Detection Using the I-Vector Framework},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1473},
url={http://dx.doi.org/10.21437/Interspeech.2016-1473},
pages={2398--2402}
}