12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

iVector Approach to Phonotactic Language Recognition

Mehdi Soufifar (1), Marcel Kockmann (2), Lukáš Burget (2), Oldřich Plchot (2), Ondřej Glembek (2), Torbjørn Svendsen (1)

(1) NTNU, Norway
(2) Brno University of Technology, Czech Republic

This paper presents a novel technique for representing and processing n-gram counts in phonotactic language recognition (LRE): subspace multinomial modelling represents the vectors of n-gram counts by low-dimensional vectors of coordinates in a total variability subspace, called iVectors. Two techniques for iVector scoring are tested: support vector machines (SVM) and logistic regression (LR). Using the standard NIST LRE 2009 task as our evaluation set, the latter scoring approach was shown to outperform a phonotactic LRE system based on direct SVM classification of n-gram count vectors. The proposed iVector paradigm also shows results comparable to the previously proposed PCA-based phonotactic feature extraction.
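
As an illustration only, the idea of mapping an n-gram count vector to a low-dimensional iVector can be sketched in a few lines of Python. The abstract does not give the model equations, so the sketch assumes the usual subspace multinomial form, in which the n-gram probabilities are a softmax of m + T w (m an offset vector, T the subspace matrix, w the iVector). The function names, the toy values of m and T, and the plain gradient-ascent update are hypothetical choices made for this example. They are not the authors' implementation, which also has to train T and then score the iVectors with an SVM or logistic regression.

```python
import math
from collections import Counter


def ngram_counts(phones, n, vocabulary):
    """Count the n-grams of a phone sequence, in the order given by vocabulary."""
    grams = Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))
    return [grams[g] for g in vocabulary]


def multinomial_probs(m, T, w):
    """Softmax of m + T w: the n-gram probabilities implied by the iVector w."""
    logits = [m_e + sum(t * w_k for t, w_k in zip(row, w)) for m_e, row in zip(m, T)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(counts, m, T, w):
    """Multinomial log-likelihood of the counts (up to a constant in w)."""
    probs = multinomial_probs(m, T, w)
    return sum(c * math.log(p) for c, p in zip(counts, probs))


def estimate_ivector(counts, m, T, steps=200, lr=0.05):
    """Estimate w by gradient ascent with m and T held fixed (assumed given)."""
    dim = len(T[0])
    w = [0.0] * dim
    total = sum(counts)
    for _ in range(steps):
        probs = multinomial_probs(m, T, w)
        # Gradient of the log-likelihood w.r.t. w is T^T (counts - total * probs).
        grad = [
            sum(T[e][k] * (counts[e] - total * probs[e]) for e in range(len(counts)))
            for k in range(dim)
        ]
        # Dividing by the total count keeps the step size independent of length.
        w = [w_k + lr * g / total for w_k, g in zip(w, grad)]
    return w


# Toy usage: three bigram types, a 2-dimensional subspace (all values made up).
vocab = [("a", "b"), ("b", "a"), ("b", "c")]
counts = ngram_counts(["a", "b", "a", "b", "c"], 2, vocab)
m = [0.0, 0.0, 0.0]
T = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
w = estimate_ivector(counts, m, T)
print(w, log_likelihood(counts, m, T, w))
```

In this toy setup the vector w plays the role of the iVector: a fixed-length summary of the utterance's n-gram statistics that a classifier can consume in place of the full count vector.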


Bibliographic reference. Soufifar, Mehdi / Kockmann, Marcel / Burget, Lukáš / Plchot, Oldřich / Glembek, Ondřej / Svendsen, Torbjørn (2011): "iVector approach to phonotactic language recognition", In INTERSPEECH-2011, 2913-2916.