Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Comparing Different Model Configurations for Language Identification Using a Phonotactic Approach

D. Matrouf (1,2), Martine Adda-Decker (1), Jean-Luc Gauvain (1), Lori Lamel (1)

(1) LIMSI-CNRS, Orsay, France
(2) LIA, University of Avignon, France

This paper explores different model configurations for language identification (LID) using a phonotactic approach. Identification experiments were carried out on the 11-language telephone speech corpus OGI-TS, containing calls in French, English, German, Spanish, Japanese, Korean, Mandarin, Tamil, Farsi, Hindi, and Vietnamese. Phone sequences output by one or multiple phone recognizers are rescored with language-dependent phonotactic models approximated by phone bigrams. The parameters of different sets of acoustic phone models were estimated using the 4-language IDEAL corpus. Sets of language-specific phonotactic models were trained using the training portion of the OGI-TS corpus. Error rates are significantly reduced by combining language-dependent and language-independent acoustic decoders, especially for short segments. A 9.9% LID error rate was obtained on the 11-language task using phonotactic models trained on spontaneous speech data. These results show that the phonotactic approach is relatively insensitive to an acoustic mismatch between training and test conditions.
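
The rescoring step described in the abstract can be illustrated with a minimal sketch: one phone bigram model per language, each scoring a decoded phone sequence, with the highest-scoring language selected. This is not the authors' implementation. The add-alpha smoothing, the function names, the single decoded phone stream, and the toy phone sequences are all assumptions introduced here for illustration.

```python
import math
from collections import Counter

def train_bigram(sequences, vocab, alpha=1.0):
    """Estimate add-alpha smoothed phone bigram log-probabilities.

    The smoothing scheme is an assumption; the abstract only says the
    phonotactic models are approximated by phone bigrams.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    targets = list(vocab) + ["</s>"]   # symbols that can follow a context
    contexts = list(vocab) + ["<s>"]   # symbols that can precede a target
    extra = alpha * len(targets)
    return {
        (p, c): math.log((pair_counts[(p, c)] + alpha) / (context_counts[p] + extra))
        for p in contexts
        for c in targets
    }

def score(model, seq):
    """Log-likelihood of a phone sequence under one language's bigram model."""
    padded = ["<s>"] + list(seq) + ["</s>"]
    return sum(model[(p, c)] for p, c in zip(padded, padded[1:]))

def identify(models, seq):
    """Return the language whose phonotactic model scores the sequence highest."""
    return max(models, key=lambda lang: score(models[lang], seq))

# Hypothetical toy data: invented phone strings, not taken from OGI-TS or IDEAL.
training = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"]],
    "lang_b": [["s", "t", "r", "i"], ["t", "r", "i", "s"]],
}
vocab = sorted({ph for seqs in training.values() for seq in seqs for ph in seq})
models = {lang: train_bigram(seqs, vocab) for lang, seqs in training.items()}

print(identify(models, ["k", "a", "t", "a"]))
```

With several phone recognizers, as in the configurations compared in the paper, each recognizer would produce its own phone sequence and set of scores; how those scores are combined is not specified in the abstract and is left out of this sketch.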

Bibliographic reference.  Matrouf, D. / Adda-Decker, Martine / Gauvain, Jean-Luc / Lamel, Lori (1999): "Comparing different model configurations for language identification using a phonotactic approach", In EUROSPEECH'99, 387-390.