ISCA Archive ICSLP 1994

Spontaneous speech language identification with a knowledge of linguistics

Shubha Kadambe, James L. Hieronymus

A task-independent spoken Language Identification (LID) system for telephone speech is described. The system is based on continuous-density, second-order, ergodic, variable-duration hidden Markov phoneme models and trigram phonemotactic models. The language-specific phoneme models are trained using the "High accuracy phoneme recognition system" [1]. A trigram phonemotactic model for each language is trained using a text corpus of about 10 million words and a grapheme-to-phoneme converter. The language $L_i$ of an incoming speech signal $x$ is hypothesized as the one that produces the highest likelihood $P(x \mid \lambda_i)\,P(\lambda_i \mid L_i)$ over all the phonemic models $\lambda_i$ of a given set of phonemes per language. LID results for three languages are presented. The effect of the phonemotactic model in distinguishing languages is demonstrated by comparing the LID results obtained with and without phonemotactic models.
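The decision rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each language's phoneme recognizer has already produced an acoustic log-likelihood and a decoded phoneme string, and it substitutes a simple add-one smoothed trigram model for the phonemotactic model. The names `TrigramPhonotacticModel` and `identify_language`, and the toy phoneme data, are hypothetical.

```python
import math
from collections import Counter


class TrigramPhonotacticModel:
    """Add-one smoothed trigram model over phoneme symbols (illustrative only)."""

    def __init__(self, sequences, inventory):
        # Symbols that can be predicted: the phoneme inventory plus end-of-sequence.
        self.vocab = set(inventory) | {"</s>"}
        self.trigrams = Counter()
        self.bigrams = Counter()
        for seq in sequences:
            padded = ["<s>", "<s>"] + list(seq) + ["</s>"]
            for a, b, c in zip(padded, padded[1:], padded[2:]):
                self.trigrams[(a, b, c)] += 1
                self.bigrams[(a, b)] += 1

    def log_prob(self, seq):
        """Log-probability of a phoneme sequence under the trigram model."""
        padded = ["<s>", "<s>"] + list(seq) + ["</s>"]
        v = len(self.vocab)
        total = 0.0
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            total += math.log((self.trigrams[(a, b, c)] + 1) / (self.bigrams[(a, b)] + v))
        return total


def identify_language(acoustic_log_likelihoods, decoded_phonemes, phonotactic_models):
    """Pick the language with the highest combined log score.

    Combined score = acoustic log-likelihood + phonotactic log-probability,
    i.e. the product of the two likelihood terms taken in the log domain.
    """
    scores = {
        lang: acoustic_log_likelihoods[lang] + model.log_prob(decoded_phonemes[lang])
        for lang, model in phonotactic_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy data, invented for illustration; not taken from the paper.
models = {
    "A": TrigramPhonotacticModel([["a", "b", "a", "b"]], ["a", "b"]),
    "B": TrigramPhonotacticModel([["b", "b", "a", "a"]], ["a", "b"]),
}
decoded = {"A": ["a", "b", "a", "b"], "B": ["a", "b", "a", "b"]}
best, scores = identify_language({"A": 0.0, "B": 0.0}, decoded, models)
```

Setting every acoustic log-likelihood to the same value isolates the contribution of the phonemotactic term, and dropping the `model.log_prob(...)` term gives a decision based on the acoustic scores alone, which mirrors the with-and-without comparison the abstract mentions.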


Cite as: Kadambe, S., Hieronymus, J.L. (1994) Spontaneous speech language identification with a knowledge of linguistics. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1879-1882

@inproceedings{kadambe94_icslp,
  author={Shubha Kadambe and James L. Hieronymus},
  title={{Spontaneous speech language identification with a knowledge of linguistics}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1879--1882}
}