8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Spoken Language Identification Using Score Vector Modeling and Support Vector Machine

Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan

Chinese Academy of Sciences, China

The support vector machine (SVM) framework based on the generalized linear discriminant sequence (GLDS) kernel has been shown to be effective and is widely used in language identification tasks. In this paper, to compensate for the distortions caused by inter-speaker variability within the same language, and to work around the practical memory limits imposed by training on large databases, multiple speaker-group-based discriminative classifiers are employed to map the cepstral features of speech utterances into discriminative language characterization score vectors (DLCSV). Backend SVM classifiers are then used to model the probability distribution of each target language in the DLCSV space, and the output scores of the backend classifiers are calibrated into the final language recognition scores by a pair-wise posterior probability estimation algorithm. The proposed SVM framework is evaluated on the 2003 NIST Language Recognition Evaluation databases and achieves an equal error rate of 4.0% on the 30-second task, a relative error reduction of more than 30% over the state-of-the-art SVM system.
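
For readers unfamiliar with the two figures quoted above, the following is a minimal sketch of how an equal error rate and a relative error reduction can be computed from detection scores. It is not the authors' scoring code or the official NIST scoring tool; the function names, the simple threshold sweep, and all scores and the 6.0% baseline in the usage lines are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: higher scores mean "more likely the target language".
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def relative_reduction(baseline_rate, new_rate):
    """Relative error reduction of new_rate with respect to baseline_rate."""
    return (baseline_rate - new_rate) / baseline_rate


# Hypothetical toy scores, not data from the paper.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))

# Hypothetical baseline of 6.0% EER compared with the reported 4.0%.
print(relative_reduction(6.0, 4.0))
```

Under this definition, a relative reduction of more than 30% that ends at 4.0% implies a baseline equal error rate above 4.0 / 0.7, or roughly 5.7%.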


Bibliographic reference. Li, Ming / Suo, Hongbin / Wu, Xiao / Lu, Ping / Yan, Yonghong (2007): "Spoken language identification using score vector modeling and support vector machine", in INTERSPEECH-2007, 350-353.