ISCA Archive Interspeech 2013

Improvements in language identification on the RATS noisy speech corpus

Jeff Ma, Bing Zhang, Spyros Matsoukas, Sri Harish Mallidi, Feipeng Li, Hynek Hermansky

This paper presents a set of techniques that we used to develop the language identification (LID) system for the second phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded radio communication channels. We report significant gains due to (a) improved speech activity detection, (b) special handling of training data to enhance performance on short-duration audio samples, and (c) noise-robust feature extraction and normalization methods, including the use of multi-layer perceptron (MLP) based phoneme posteriors. We show that on this type of noisy data, the above techniques provide on average a 27% relative improvement in equal error rate (EER) across several test duration conditions.

doi: 10.21437/Interspeech.2013-40

Cite as: Ma, J., Zhang, B., Matsoukas, S., Mallidi, S.H., Li, F., Hermansky, H. (2013) Improvements in language identification on the RATS noisy speech corpus. Proc. Interspeech 2013, 69-73, doi: 10.21437/Interspeech.2013-40

@inproceedings{ma13_interspeech,
  author={Jeff Ma and Bing Zhang and Spyros Matsoukas and Sri Harish Mallidi and Feipeng Li and Hynek Hermansky},
  title={{Improvements in language identification on the RATS noisy speech corpus}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={69--73},
  doi={10.21437/Interspeech.2013-40}
}