EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Acoustic, Phonetic, and Discriminative Approaches to Automatic Language Identification

E. Singer, P.A. Torres-Carrasquillo, T.P. Gleason, W.M. Campbell, Douglas A. Reynolds

Massachusetts Institute of Technology, USA

Formal evaluations conducted by NIST in 1996 demonstrated that systems that used parallel banks of tokenizer-dependent language models produced the best language identification performance. Since that time, other approaches to language identification have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three techniques that have been applied to the language identification problem: phone recognition, Gaussian mixture modeling, and support vector machine classification. A recognizer that fuses the scores of three systems that employ these techniques produces a 2.7% equal error rate (EER) on the 1996 NIST evaluation set and a 2.8% EER on the NIST 2003 primary condition evaluation set. An approach to dealing with the problem of out-of-set data is also discussed.
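The abstract reports performance as an equal error rate over fused system scores. As a rough illustration only, the sketch below shows one way such a figure could be computed from detection scores. The equal-weight linear fusion, the function names `fuse` and `equal_error_rate`, and the toy scores are all assumptions made for this example; the abstract does not specify the authors' fusion method or scoring tools.

```python
def fuse(score_lists, weights=None):
    """Linearly combine per-trial scores from several systems.

    Assumption: equal weights unless given; this is not the paper's backend.
    """
    n = len(score_lists)
    weights = weights or [1.0 / n] * n
    return [sum(w * s for w, s in zip(weights, trial))
            for trial in zip(*score_lists)]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # Report the midpoint at the threshold where the two rates are closest.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


# Toy scores for illustration only (not data from the paper).
targets = [2.0, 1.5, 0.2, 1.8]
nontargets = [0.1, 0.3, -0.5, -1.0]
eer = equal_error_rate(targets, nontargets)
```

With only a handful of trials the miss and false-alarm rates move in coarse steps, so the returned value is the midpoint at the closest crossing, not an interpolated operating point.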

Bibliographic reference. Singer, E. / Torres-Carrasquillo, P.A. / Gleason, T.P. / Campbell, W.M. / Reynolds, Douglas A. (2003): "Acoustic, phonetic, and discriminative approaches to automatic language identification", In EUROSPEECH-2003, 1345-1348.