INTERSPEECH 2004 - ICSLP
Recent research in automatic language identification (LID) has studied the phonotactic approach, in which all training utterances are passed through a tokenizer to obtain phonetic sequences, and these sequences are used to train a language model for each language. The true transcription of the utterances is ignored entirely. However, information in the transcription may possess important discriminating power for language identification. In this paper, we propose using a discrete hidden Markov model (DHMM) that takes into account the potential error patterns of the acoustic tokenizer and incorporates the transcription of the utterances into language model training. Furthermore, with the DHMM approach, LID using multiple phonetic tokenizers can simply be treated as supplying multi-dimensional features to the DHMM, which allows a joint decision to be made earlier in the process. A system employing this approach achieves 59.00% and 68.33% accuracy on 10-sec and 45-sec speech, respectively, in recognizing a closed set of six languages in the OGI telephone speech corpus, while the phonotactic approach gives 57.00% and 77.50% recognition accuracy on 10-sec and 45-sec speech when the phone recognizer uses three-state, three-mixture HMMs.
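The DHMM scoring idea can be illustrated with a minimal sketch. It rests on one reading of the abstract that the abstract itself does not spell out: hidden states stand for the true (transcribed) phones, observations are the tokenizer's output symbols, the emission table captures the tokenizer's error patterns, and the transition table plays the role of a phone-level language model. The function names, the two toy languages `lang_a` and `lang_b`, and all probabilities below are hypothetical and are not taken from the paper.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a token sequence under a discrete HMM (scaled forward pass).

    obs        : list of observed token indices (tokenizer output)
    start[i]   : P(first hidden state is i)
    trans[i][j]: P(next hidden state is j | current hidden state is i)
    emit[i][k] : P(tokenizer outputs k | hidden state is i)
    """
    if not obs:
        return 0.0  # an empty sequence has probability 1 under any model
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


def identify_language(obs, models):
    """Return the name of the model with the highest log-likelihood for obs."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical toy setup: two hidden phones, two tokenizer symbols.
# Both languages share the same tokenizer confusion table (20% error rate),
# but differ in their phone transition statistics.
CONFUSION = [[0.8, 0.2], [0.2, 0.8]]
MODELS = {
    "lang_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], CONFUSION),  # phones tend to repeat
    "lang_b": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], CONFUSION),  # phones tend to alternate
}

if __name__ == "__main__":
    print(identify_language([0, 0, 0, 0], MODELS))
    print(identify_language([0, 1, 0, 1], MODELS))
```

Under the same assumption, several tokenizers could be handled by letting each observation be a tuple of symbols, one per tokenizer, with the emission probability factored or tabulated jointly. The sketch does not implement this.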
Bibliographic reference. Wong, Kakeung / Siu, Man-hung (2004): "Automatic language identification using discrete hidden Markov model", In INTERSPEECH-2004, 1633-1636.