Sixth European Conference on Speech Communication and Technology
This paper explores different model configurations for language identification using a phonotactic approach. Identification experiments were carried out on the 11-language telephone speech corpus OGI-TS, containing calls in French, English, German, Spanish, Japanese, Korean, Mandarin, Tamil, Farsi, Hindi, and Vietnamese. Phone sequences output by one or multiple phone recognizers are rescored with language-dependent phonotactic models approximated by phone bigrams. The parameters of different sets of acoustic phone models were estimated using the 4-language IDEAL corpus. Sets of language-specific phonotactic models were trained using the training portion of the OGI-TS corpus. Error rates are significantly reduced by combining language-dependent and language-independent acoustic decoders, especially for short segments. A 9.9% LID error rate was obtained on the 11-language task using phonotactic models trained on spontaneous speech data. These results show that the phonotactic approach is relatively insensitive to an acoustic mismatch between training and test conditions.
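As an illustration of the rescoring step described in the abstract, the following is a minimal sketch of phone-bigram language identification. It assumes the phone sequence has already been decoded by a phone recognizer, and it uses add-alpha smoothing and toy phone symbols; the function names (`train_bigram`, `score`, `identify`), the smoothing scheme, and the data are hypothetical choices made for this example, not details taken from the paper.

```python
import math
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sequence boundary symbols


def train_bigram(sequences, vocab, alpha=1.0):
    """Estimate add-alpha smoothed phone bigram log-probabilities.

    The smoothing scheme is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        padded = [START] + list(seq) + [END]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1.0
    targets = list(vocab) + [END]
    model = {}
    for prev in list(vocab) + [START]:
        total = sum(counts[prev].values()) + alpha * len(targets)
        model[prev] = {
            cur: math.log((counts[prev][cur] + alpha) / total)
            for cur in targets
        }
    return model


def score(model, seq):
    """Log-likelihood of a decoded phone sequence under one bigram model."""
    padded = [START] + list(seq) + [END]
    return sum(model[prev][cur] for prev, cur in zip(padded, padded[1:]))


def identify(models, seq):
    """Return the language whose phonotactic model scores the sequence highest."""
    return max(models, key=lambda lang: score(models[lang], seq))


# Toy data: invented phone strings for two hypothetical languages.
vocab = ["a", "b", "k"]
models = {
    "lang_x": train_bigram([["a", "b", "a", "b"], ["b", "a", "b"]], vocab),
    "lang_y": train_bigram([["k", "a", "k", "a"], ["a", "k", "k"]], vocab),
}
print(identify(models, ["a", "b", "a"]))
print(identify(models, ["k", "a", "k"]))
```

With several phone recognizers, one plausible extension would be to sum the per-recognizer scores for each language before taking the maximum; how the paper actually combines decoders is not specified in the abstract.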
Bibliographic reference. Matrouf, D. / Adda-Decker, Martine / Gauvain, Jean-Luc / Lamel, Lori (1999): "Comparing different model configurations for language identification using a phonotactic approach", In EUROSPEECH'99, 387-390.