Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

Complementary Combination in i-Vector Level for Language Recognition

Zhi-Yi Li, Wei-Qiang Zhang, Liang He, Jia Liu

Department of Electronic Engineering, Tsinghua University, Beijing, China

Recently, i-vector based techniques have shown good performance in language recognition (LRE). From the viewpoint of information theory, i-vectors derived from different acoustic features can together carry more useful, complementary language information than either kind alone. In this paper, we propose an effective complementary combination of two kinds of i-vectors. One is derived from the commonly used short-term spectral shifted delta cepstra (SDC), and the other from a novel spectro-temporal feature, the time-frequency cepstrum (TFC). To overcome the curse of dimensionality and to remove redundant information in the combined i-vectors, we apply principal component analysis (PCA) and linear discriminant analysis (LDA) and evaluate the performance of each. For classification, cosine distance scoring (CDS) and support vector machines (SVM) are applied to the new combined i-vectors. Experiments on the NIST LRE 2009 dataset show that the proposed method outperforms the baseline, reducing the EER by 1% for the 30 s duration and by 2.3% for both the 10 s and 3 s durations.

Index Terms: i-vector combination, SDC, TFC, PCA, LDA, language recognition
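As a rough illustration of the combination idea described in the abstract, the sketch below concatenates two i-vectors for the same utterance and scores the result against per-language mean vectors with cosine similarity. It is a minimal toy under stated assumptions, not the authors' implementation: the function names, the per-stream length normalization before concatenation, the use of mean vectors as language models, and the two-dimensional toy data are all hypothetical, and the PCA/LDA reduction step and the SVM classifier are omitted.

```python
import math


def length_normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(ivec_sdc, ivec_tfc):
    """Concatenate an SDC-derived and a TFC-derived i-vector.

    Assumption: each stream is length-normalized first so that neither
    dominates the combined vector; the abstract does not specify this.
    """
    return length_normalize(ivec_sdc) + length_normalize(ivec_tfc)


def cosine_score(a, b):
    """Cosine similarity between two vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def language_models(train):
    """Average the combined training i-vectors of each language."""
    models = {}
    for lang, vecs in train.items():
        dim = len(vecs[0])
        models[lang] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return models


def classify(models, test_vec):
    """Return the language whose model has the highest cosine score."""
    return max(models, key=lambda lang: cosine_score(models[lang], test_vec))


# Toy 2-dimensional "i-vectors" per stream (hypothetical values).
train = {
    "lang_a": [combine([1.0, 0.1], [0.9, 0.2]), combine([0.9, 0.0], [1.0, 0.1])],
    "lang_b": [combine([0.1, 1.0], [0.2, 0.9]), combine([0.0, 0.8], [0.1, 1.0])],
}
models = language_models(train)
predicted = classify(models, combine([0.8, 0.2], [0.7, 0.1]))
print(predicted)
```

In a realistic setup, a dimensionality reduction such as PCA or LDA would be fitted on the combined training vectors and applied before scoring, as the abstract describes.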

Bibliographic reference.  Li, Zhi-Yi / Zhang, Wei-Qiang / He, Liang / Liu, Jia (2012): "Complementary combination in i-vector level for language recognition", In Odyssey-2012, 334-337.