ISCA Archive Interspeech 2021

Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition

Matthew Wiesner, Mousmita Sarma, Ashish Arora, Desh Raj, Dongji Gao, Ruizhe Huang, Supreet Preet, Moris Johnson, Zikra Iqbal, Nagendra Goel, Jan Trmal, Leibny Paola García Perera, Sanjeev Khudanpur

In this paper, we describe the JHU-GoVivace submission for subtask 2 (code-switching task) of the Multilingual and Code-switching ASR challenges for low resource Indian languages. We built a hybrid HMM-DNN system with several improvements over the provided baseline in terms of lexical, language, and acoustic modeling. For lexical modeling, we investigate using unified pronunciations and phonesets derived from the baseline lexicon and publicly available WikiPron lexicons in Bengali and Hindi to expand the pronunciation lexicons. We explore several neural network architectures, along with supervised pretraining and multilingual training for acoustic modeling. We also describe how we used large externally crawled web text for language modeling. Since the challenge data contain artefacts such as misalignments, various data cleanup methods are explored, including acoustic-driven pronunciation learning to help discover Indian-accented pronunciations for English words as well as transcribed punctuation. As a result of these efforts, our best systems achieve transliterated WERs of 19.5% and 23.2% on the non-duplicated development sets for Hindi-English and Bengali-English, respectively.
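The abstract reports results in terms of transliterated WER, i.e. word error rate computed after mapping hypotheses and references into a common script so that code-switched words are not penalized merely for the script they were written in. The following is a minimal sketch of that idea, not the challenge's official scoring script: the transliterate function is a placeholder (a real romanization table or the challenge-provided transliteration would be used), and the function names transliterated_wer and edit_distance are illustrative.

# Sketch of transliterated WER: transliterate both sides to a common
# script, then compute ordinary word-level WER.

def transliterate(word: str) -> str:
    """Placeholder: map a token into a common script.
    A real system would apply a Devanagari/Bengali-to-Latin
    transliteration table here; lowercasing stands in for it."""
    return word.lower()

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance via dynamic programming."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)]

def transliterated_wer(references, hypotheses):
    """WER (%) computed after transliterating every token."""
    errors, words = 0, 0
    for ref, hyp in zip(references, hypotheses):
        ref_t = [transliterate(w) for w in ref.split()]
        hyp_t = [transliterate(w) for w in hyp.split()]
        errors += edit_distance(ref_t, hyp_t)
        words += len(ref_t)
    return 100.0 * errors / max(words, 1)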


doi: 10.21437/Interspeech.2021-2127

Cite as: Wiesner, M., Sarma, M., Arora, A., Raj, D., Gao, D., Huang, R., Preet, S., Johnson, M., Iqbal, Z., Goel, N., Trmal, J., Perera, L.P.G., Khudanpur, S. (2021) Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition. Proc. Interspeech 2021, 2906-2910, doi: 10.21437/Interspeech.2021-2127

@inproceedings{wiesner21_interspeech,
  author={Matthew Wiesner and Mousmita Sarma and Ashish Arora and Desh Raj and Dongji Gao and Ruizhe Huang and Supreet Preet and Moris Johnson and Zikra Iqbal and Nagendra Goel and Jan Trmal and Leibny Paola García Perera and Sanjeev Khudanpur},
  title={{Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition}},
  year={2021},
  booktitle={Proc. Interspeech 2021},
  pages={2906--2910},
  doi={10.21437/Interspeech.2021-2127}
}