14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Improvements in Language Identification on the RATS Noisy Speech Corpus

Jeff Ma (1), Bing Zhang (1), Spyros Matsoukas (1), Sri Harish Mallidi (2), Feipeng Li (2), Hynek Hermansky (2)

(1) Raytheon BBN Technologies, USA
(2) Johns Hopkins University, USA

This paper presents a set of techniques that we used to develop the language identification (LID) system for the second phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded radio communication channels. We report significant gains due to (a) improved speech activity detection, (b) special handling of training data to enhance performance on short-duration audio samples, and (c) noise-robust feature extraction and normalization methods, including the use of multi-layer perceptron (MLP) based phoneme posteriors. We show that on this type of noisy data, the above techniques provide on average a 27% relative improvement in equal error rate (EER) across several test duration conditions.
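
For readers unfamiliar with the metric, the following is a minimal sketch of how an equal error rate and a relative improvement are commonly computed from detection scores. It is not the authors' scoring tool: the function names `equal_error_rate` and `relative_improvement`, the threshold sweep, and all scores and rates in the example are illustrative assumptions, not values from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and false-alarm rate are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))

    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        miss_rate = np.mean(target < t)            # target trials rejected
        false_alarm_rate = np.mean(nontarget >= t)  # non-target trials accepted
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            best_gap = gap
            eer = (miss_rate + false_alarm_rate) / 2.0
    return float(eer)


def relative_improvement(baseline_eer, new_eer):
    """Fractional reduction in EER relative to the baseline."""
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical scores, chosen only to illustrate the computation.
example_targets = [0.9, 0.8, 0.7, 0.3]
example_nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(example_targets, example_nontargets))

# Hypothetical EERs: a drop from 10.0% to 7.3% is a 27% relative improvement.
print(relative_improvement(0.100, 0.073))
```

Under this definition, a relative improvement of 27% means the new EER is 73% of the baseline EER; it says nothing about the absolute error rates, which the abstract does not state.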

Bibliographic reference. Ma, Jeff / Zhang, Bing / Matsoukas, Spyros / Mallidi, Sri Harish / Li, Feipeng / Hermansky, Hynek (2013): "Improvements in language identification on the RATS noisy speech corpus", In INTERSPEECH-2013, 69-73.