ISCA Archive Interspeech 2013

TRAP language identification system for RATS phase II evaluation

Kyu J. Han, Sriram Ganapathy, Ming Li, Mohamed K. Omar, Shrikanth Narayanan

Automatic language identification or detection of audio data has become an important preprocessing step for speech/speaker recognition and audio data mining. In many surveillance applications, language detection has to be performed on highly degraded audio inputs. In this paper, we present our work on language detection in highly degraded radio channel scenarios. We provide a brief description of the Targeted Robust Audio Processing (TRAP) language detection system built for the Phase II Evaluation of the Robust Automatic Transcription of Speech (RATS) program. This system is a combination of 15 systems with different frontends and speech activity decisions. We also analyze the usefulness of multi-layer perceptron (MLP) based non-linear projection of i-vectors before SVM classification. The proposed backend reduces the Equal Error Rate (EER) by 11%.25% relative compared to the baseline PCA-based feature representation for SVM classification, on the RATS test data consisting of data from eight high-frequency radio communication channels.
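For readers unfamiliar with the metric, the following is a minimal sketch of how an Equal Error Rate and a relative reduction between two systems could be computed from detection scores. It is not the evaluation tooling used in the paper, which the abstract does not describe. The function names `equal_error_rate` and `relative_reduction` and the toy scores are assumptions made for illustration only.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumption: higher scores mean "more likely the target language".
    The EER is taken at the threshold where the miss rate and the
    false-alarm rate are closest, as the average of the two rates.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target_scores, nontarget_scores]))

    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        miss = np.mean(target_scores < t)              # targets rejected
        false_alarm = np.mean(nontarget_scores >= t)   # non-targets accepted
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2.0
    return float(eer)


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction of a new system with respect to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer


# Toy scores (hypothetical, not from the RATS data).
toy_targets = [0.9, 0.8, 0.7, 0.4]
toy_nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

A relative reduction is a ratio of error rates and not a difference in percentage points: under this definition, a baseline EER of 10% lowered to 8% is a 20% relative reduction.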


doi: 10.21437/Interspeech.2013-388

Cite as: Han, K.J., Ganapathy, S., Li, M., Omar, M.K., Narayanan, S. (2013) TRAP language identification system for RATS phase II evaluation. Proc. Interspeech 2013, 1502-1506, doi: 10.21437/Interspeech.2013-388

@inproceedings{han13b_interspeech,
  author={Kyu J. Han and Sriram Ganapathy and Ming Li and Mohamed K. Omar and Shrikanth Narayanan},
  title={{TRAP language identification system for RATS phase II evaluation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={1502--1506},
  doi={10.21437/Interspeech.2013-388}
}