EUROSPEECH 2003 - INTERSPEECH 2003
Recently, we have proposed a parallel sub-word recognition (PSWR) system for language identification (LID) in a framework similar to the parallel phone recognition (PPR) approach in the literature, but without requiring phonetic labeling of the speech data in any of the languages in the LID task. In this paper, we show the theoretical equivalence of PSWR and ergodic-HMM (E-HMM) based LID. Here, the front-end sub-word recognizer (SWR) and back-end language model (LM) of each language in PSWR correspond, respectively, to the states and state transitions of the E-HMM for that language. This equivalence unifies the parallel phone (sub-word) recognition and ergodic-HMM approaches, which have so far been treated as two distinct frameworks in the LID literature, and thus provides further insight into both. On a 6-language LID task using the OGI-TS database, the E-HMM system achieves performance comparable to that of the PSWR system, offering experimental validation of their equivalence.
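The E-HMM side of this equivalence can be illustrated with a minimal sketch: one ergodic (fully connected) HMM per language, with the test utterance assigned to the language whose model gives the highest likelihood. Everything specific here is an assumption made for illustration and is not taken from the paper: the observations are discrete symbols (a real system would score acoustic feature vectors), the two toy models `lang_a` and `lang_b` and their parameters are invented, and the function names are hypothetical. In the reading suggested by the abstract, each state plays the role of a sub-word unit and the transition matrix plays the role of the language model over those units.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_loglik(obs, log_pi, log_A, log_B):
    """Log-likelihood of a discrete observation sequence under one ergodic HMM.

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability i -> j (all transitions allowed)
    log_B[j][o] : log probability that state j emits symbol o
    """
    n = len(log_pi)
    alpha = [log_pi[j] + log_B[j][obs[0]] for j in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(n)]) + log_B[j][o]
            for j in range(n)
        ]
    return logsumexp(alpha)


def identify_language(obs, models):
    """Return (best_language, scores) by maximum likelihood over all models."""
    scores = {lang: forward_loglik(obs, *params) for lang, params in models.items()}
    return max(scores, key=scores.get), scores


def log_matrix(rows):
    """Element-wise log of a matrix given as a list of rows."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical toy models: identical states (emissions), different transitions.
# Only the transition structure, i.e. the "language model" part, separates them.
LOG_PI = [math.log(0.5), math.log(0.5)]
LOG_B = log_matrix([[0.8, 0.2], [0.2, 0.8]])
MODELS = {
    "lang_a": (LOG_PI, log_matrix([[0.9, 0.1], [0.1, 0.9]]), LOG_B),  # states persist
    "lang_b": (LOG_PI, log_matrix([[0.1, 0.9], [0.9, 0.1]]), LOG_B),  # states alternate
}

if __name__ == "__main__":
    for utterance in ([0, 0, 0, 0, 1, 1, 1, 1], [0, 1, 0, 1, 0, 1, 0, 1]):
        best, scores = identify_language(utterance, MODELS)
        print(utterance, "->", best)
```

Because both toy models share the same emission distributions, the decision rests entirely on the transition probabilities, which mirrors the role the abstract assigns to the back-end LM. This sketch uses the forward algorithm to sum over all state sequences; whether the paper scores with the full likelihood or with the best (Viterbi) path is not stated in the abstract.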
Bibliographic reference. Ramasubramanian, V. / Jayram, A.K.V. Sai / Sreenivas, T.V. (2003): "Language identification using parallel sub-word recognition - an ergodic HMM equivalence", In EUROSPEECH-2003, 1357-1360.