ISCA Archive Interspeech 2021

SRI-B End-to-End System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages

Hardik Sailor, Kiran Praveen T, Vikas Agrawal, Abhinav Jain, Abhishek Pandey

This paper describes SRI-B’s end-to-end Automated Speech Recognition (ASR) system proposed for subtask-1 of the multilingual ASR challenge for Indian languages. Our end-to-end (E2E) ASR model is based on the transformer architecture and is trained by jointly minimizing Connectionist Temporal Classification (CTC) and Cross-Entropy (CE) losses. A conventional multilingual model, trained by pooling data from multiple languages, generalizes better, but at the cost of degraded performance relative to its monolingual counterparts. In our experiments, a multilingual model is trained by conditioning the input features on a language-specific embedding vector. These language-specific embedding vectors are obtained by training a language classifier with an attention-based transformer architecture and then taking its bottleneck features as language identification (LID) embeddings. We further adapt the multilingual system with language-specific data to reduce the degradation on individual languages. We propose a novel hypothesis elimination strategy, based on LID scores and length-normalized probabilities, that selects the best model from the pool of available models. The experimental results show that the proposed multilingual training and hypothesis elimination strategy give an average relative word error rate (WER) improvement of 3.02% on the blind set over the challenge hybrid ASR baseline system.
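The abstract does not spell out the hypothesis elimination rule, so the following is only a minimal sketch of one way such a selection could work, not the authors' implementation. It assumes each model in the pool returns one hypothesis with a total log-probability, that the LID classifier yields a per-language posterior, and that hypotheses from models whose language scores below a threshold are eliminated before the survivors are ranked by length-normalized log-probability. The names `Hypothesis`, `select_hypothesis`, and `lid_threshold`, the threshold value, and the language codes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    model_lang: str      # language the model was adapted to (hypothetical label)
    tokens: List[str]    # decoded token sequence
    log_prob: float      # total log-probability of the sequence


def length_normalized(h: Hypothesis) -> float:
    """Average log-probability per token, so long outputs are not penalized."""
    return h.log_prob / max(len(h.tokens), 1)


def select_hypothesis(
    hyps: List[Hypothesis],
    lid_scores: Dict[str, float],
    lid_threshold: float = 0.1,  # assumed value, not taken from the paper
) -> Hypothesis:
    """Drop hypotheses from unlikely languages, then pick the best survivor."""
    survivors = [
        h for h in hyps if lid_scores.get(h.model_lang, 0.0) >= lid_threshold
    ]
    # If the LID scores eliminate everything, fall back to the full pool.
    if not survivors:
        survivors = hyps
    return max(survivors, key=length_normalized)


hyps = [
    Hypothesis("hi", ["a", "b"], -1.0),
    Hypothesis("mr", ["a", "b", "c", "d"], -1.2),
    Hypothesis("or", ["x"], -0.1),
]
lid_scores = {"hi": 0.60, "mr": 0.35, "or": 0.05}
best = select_hypothesis(hyps, lid_scores)
```

In this toy input the third hypothesis has the highest per-token score but is eliminated because its language posterior falls below the assumed threshold, so the choice is made between the remaining two.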


doi: 10.21437/Interspeech.2021-1578

Cite as: Sailor, H., T, K.P., Agrawal, V., Jain, A., Pandey, A. (2021) SRI-B End-to-End System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages. Proc. Interspeech 2021, 2456-2460, doi: 10.21437/Interspeech.2021-1578

@inproceedings{sailor21_interspeech,
  author={Hardik Sailor and Kiran Praveen T and Vikas Agrawal and Abhinav Jain and Abhishek Pandey},
  title={{SRI-B End-to-End System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={2456--2460},
  doi={10.21437/Interspeech.2021-1578}
}