An Exploration towards Joint Acoustic Modeling for Indian Languages: IIIT-H Submission for Low Resource Speech Recognition Challenge for Indian Languages, INTERSPEECH 2018

Hari Krishna, Krishna Gurugubelli, Vishnu Vidyadhara Raju V, Anil Kumar Vuppala


Because India is a multilingual society, a multilingual automatic speech recognition (ASR) system is widely appreciated. Despite their different orthographies, Indian languages share the same phonetic space. To exploit this property, a joint acoustic model is trained with a common phone-set to build a multilingual ASR system. Three Indian languages, namely Telugu, Tamil, and Gujarati, are considered in this study. This work examines how well two acoustic modeling approaches lend themselves to training a joint acoustic model with a common phone-set: sub-space Gaussian mixture models (SGMM) and recurrent neural networks (RNN) trained with the connectionist temporal classification (CTC) objective function. The experimental results show that the joint acoustic models trained with RNN-CTC perform better than the SGMM system, even with only 120 hours of data (approximately 40 hours per language). The joint acoustic model trained with RNN-CTC also performs better than the monolingual models, owing to efficient data sharing across the languages. Conditioning the joint model on language identity gave only a minimal advantage. Sub-sampling the features by a factor of 2 while training the RNN-CTC models reduced training time and improved performance.
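The two input-side ideas mentioned in the abstract, sub-sampling the features by a factor of 2 and conditioning on language identity, can be illustrated with a minimal sketch. The abstract does not say how either was implemented, so everything here is an assumption: sub-sampling is shown as plain frame skipping (keeping every second frame), conditioning is shown as a one-hot language vector appended to each frame, and the names `subsample_frames`, `append_language_id`, and `LANGUAGES` are hypothetical.

```python
from typing import List, Sequence

# The three languages named in the abstract; the ordering is an arbitrary assumption.
LANGUAGES = ("telugu", "tamil", "gujarati")


def subsample_frames(frames: Sequence[Sequence[float]], factor: int = 2) -> List[List[float]]:
    """Keep every `factor`-th frame, starting from the first one.

    Assumption: sub-sampling means frame skipping. The paper may instead
    stack or average neighbouring frames.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [list(frame) for frame in frames[::factor]]


def append_language_id(frames: Sequence[Sequence[float]], language: str) -> List[List[float]]:
    """Append a one-hot language vector to every frame.

    Assumption: conditioning is done at the input layer. The paper may
    inject the language identity elsewhere in the network.
    """
    if language not in LANGUAGES:
        raise ValueError(f"unknown language: {language}")
    one_hot = [1.0 if language == name else 0.0 for name in LANGUAGES]
    return [list(frame) + one_hot for frame in frames]


# Five toy frames with two feature dimensions each.
toy_frames = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1], [4.0, 4.1]]
subsampled = subsample_frames(toy_frames, factor=2)
conditioned = append_language_id(subsampled, "tamil")
```

Halving the number of frames halves the length of the sequence the RNN has to unroll over, which is consistent with the reduced training time reported in the abstract. With CTC, the shortened input must still contain at least as many frames as the target label sequence requires.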


DOI: 10.21437/Interspeech.2018-1584

Cite as: Krishna, H., Gurugubelli, K., V, V.V.R., Vuppala, A.K. (2018) An Exploration towards Joint Acoustic Modeling for Indian Languages: IIIT-H Submission for Low Resource Speech Recognition Challenge for Indian Languages, INTERSPEECH 2018. Proc. Interspeech 2018, 3192-3196, DOI: 10.21437/Interspeech.2018-1584.


@inproceedings{Krishna2018,
  author={Hari Krishna and Krishna Gurugubelli and Vishnu Vidyadhara Raju V and Anil Kumar Vuppala},
  title={An Exploration towards Joint Acoustic Modeling for Indian Languages: IIIT-H Submission for Low Resource Speech Recognition Challenge for Indian Languages, INTERSPEECH 2018},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3192--3196},
  doi={10.21437/Interspeech.2018-1584},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1584}
}