5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Predicting, Diagnosing and Improving Automatic Language Identification Performance

Marc A. Zissman

Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, MA, USA

Language-identification (LID) techniques that use multiple single-language phoneme recognizers followed by n-gram language models have consistently yielded top performance at NIST evaluations. In our study of such systems, we have recently cut our LID error rate by modeling the output of n-gram language models more carefully. Additionally, we are now able to produce meaningful confidence scores along with our LID hypotheses. Finally, we have developed some diagnostic measures that can predict performance of our LID algorithms.


Bibliographic reference. Zissman, Marc A. (1997): "Predicting, diagnosing and improving automatic language identification performance", in EUROSPEECH-1997, 51-54.