Third International Conference on Spoken Language Processing (ICSLP 94)
This paper describes three methods of language identification, all of which are based on hidden Markov models (HMMs). We focus mainly on language identification for English and Japanese. In the first method, a fully structured (ergodic) HMM was trained for each language using text-independent speech samples from 10 native speakers, and the likelihood of the input speech was calculated under each language's HMM. In the second method, a universal ergodic HMM was trained on the data from all languages and used to compute the most likely state sequence for each language. The resulting state sequences were processed and used to construct a trigram model for each language, which modeled that language's phonotactics. In the third method, a set of phonemic/syllabic HMMs was trained: 60 phonemic HMMs for English and 113 syllabic HMMs for Japanese. With this system, the most likely sequence of phonemes or syllables of each language, together with its likelihood, was determined for the input speech. Separate test data were provided. On these test data, the first method achieved an identification rate of 96.5% for English-Japanese identification, the second method 85.6%, and the third method 93.9%. Language identification tests were also performed on a 4-language (English, Japanese, Chinese and Indonesian) database. Combining the first and second methods gave the best performance, 98.4%, for utterances lasting 10 seconds.
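As an illustration of the second method's final stage, the following minimal Python sketch scores a decoded HMM state sequence under one trigram model per language and picks the highest-scoring language. It is not the authors' implementation: the function names, the add-one smoothing, and the toy state sequences are all assumptions made for illustration, and the abstract does not specify how the state sequences were processed before the trigram models were built.

```python
import math
from collections import Counter


def train_trigram(sequences, num_states):
    """Count state trigrams and their two-state contexts (hypothetical helper)."""
    tri, ctx = Counter(), Counter()
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1
    return tri, ctx, num_states


def log_prob(model, seq):
    """Log-probability of a state sequence under a trigram model.

    Add-one smoothing is an assumption here; the abstract does not
    state which smoothing, if any, was used.
    """
    tri, ctx, num_states = model
    total = 0.0
    for a, b, c in zip(seq, seq[1:], seq[2:]):
        total += math.log((tri[(a, b, c)] + 1) / (ctx[(a, b)] + num_states))
    return total


def identify(models, seq):
    """Return the language whose trigram model scores the sequence highest."""
    return max(models, key=lambda lang: log_prob(models[lang], seq))


# Toy state sequences standing in for the output of a universal ergodic HMM.
models = {
    "english": train_trigram([[0, 1, 2, 0, 1, 2, 0, 1, 2]], num_states=3),
    "japanese": train_trigram([[0, 2, 1, 0, 2, 1, 0, 2, 1]], num_states=3),
}
result = identify(models, [0, 1, 2, 0, 1])
```

In a real system the input to `identify` would be the most likely state sequence decoded from the speech signal, and the trigram score could be combined with an acoustic likelihood such as the one produced by the first method.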
Bibliographic reference. Reyes, Allan A. / Seino, Takashi / Nakagawa, Seiichi (1994): "Three language identification methods based on HMMs", In ICSLP-1994, 1895-1898.