ISCA Archive ICSLP 1994

Recent improvements in an approach to segment-based automatic language identification

Timothy J. Hazen, Victor W. Zue

In 1993, a segment-based system for Automatic Language Identification (ALI) was developed and introduced. The system incorporates phonetic, acoustic, and prosodic information within a probabilistic framework. The original system was trained and tested on the OGI Multi-Language Telephone Speech Corpus and identified the language of test utterances from that corpus with an accuracy of 57.3%. Recent improvements to the system include channel normalization during preprocessing, the use of the recently transcribed utterances from the OGI corpus for phonetic recognition training, the use of mixture Gaussian density functions for modeling prosodic information, and a hill-climbing optimization procedure for determining the scaling factors used when combining the scores from different models. The current system identifies the language of test utterances with an accuracy of 79.7%.
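
The abstract does not spell out the hill-climbing procedure, so the following is only a minimal sketch of one common way such a search could look, not the authors' implementation. It assumes each model produces a per-language log score for every utterance, that the combined score is a weighted sum of those scores, and that the scaling factors are tuned to maximize identification accuracy on held-out data. The names `accuracy` and `hill_climb`, the starting weights, and the step schedule are all hypothetical.

```python
from typing import List, Sequence, Tuple

# scores[m][u][l]: log score from model m for utterance u under language l
# (an assumed layout, chosen only for this illustration).
Scores = Sequence[Sequence[Sequence[float]]]


def accuracy(weights: Sequence[float], scores: Scores, labels: Sequence[int]) -> float:
    """Fraction of utterances whose highest weighted-sum score is the true language."""
    correct = 0
    for u, label in enumerate(labels):
        n_langs = len(scores[0][u])
        combined = [
            sum(w * scores[m][u][lang] for m, w in enumerate(weights))
            for lang in range(n_langs)
        ]
        predicted = max(range(n_langs), key=combined.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def hill_climb(
    scores: Scores,
    labels: Sequence[int],
    step: float = 0.5,
    min_step: float = 0.01,
) -> Tuple[List[float], float]:
    """Coordinate-wise hill climbing over one scaling factor per model.

    Each pass nudges every weight up and down by `step` and keeps a move
    only if accuracy strictly improves. When a full pass yields no
    improvement, the step is halved; the search stops once the step
    falls below `min_step`.
    """
    weights = [1.0] * len(scores)  # assumed starting point: equal weights
    best = accuracy(weights, scores, labels)
    while step >= min_step:
        improved = False
        for m in range(len(weights)):
            for delta in (step, -step):
                trial = list(weights)
                trial[m] = max(0.0, trial[m] + delta)  # keep weights non-negative
                acc = accuracy(trial, scores, labels)
                if acc > best:
                    weights, best, improved = trial, acc, True
        if not improved:
            step /= 2
    return weights, best
```

Like any hill-climbing search, this sketch can settle in a local optimum, and the weights it finds depend on the starting point and step schedule. Tuning on data separate from the final test set avoids an optimistic accuracy estimate.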


Cite as: Hazen, T.J., Zue, V.W. (1994) Recent improvements in an approach to segment-based automatic language identification. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1883-1886

@inproceedings{hazen94_icslp,
  author={Timothy J. Hazen and Victor W. Zue},
  title={{Recent improvements in an approach to segment-based automatic language identification}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1883--1886}
}