Indian Languages ASR: A Multilingual Phone Recognition Framework with IPA Based Common Phone-set, Predicted Articulatory Features and Feature fusion

Manjunath K E, K. Sreenivasa Rao, Dinesh Babu Jayagopi, V Ramasubramanian


In this study, a multilingual phone recognition system for four Indian languages (Kannada, Telugu, Bengali and Odia) is described. The International Phonetic Alphabet (IPA) is used to derive the transcriptions. The Multilingual Phone Recognition System (MPRS) is developed using state-of-the-art DNNs. The performance of the MPRS is improved using Articulatory Features (AFs). DNNs are used to predict the AFs for the place, manner, roundness, frontness and height AF groups. Further, an MPRS is also developed using oracle AFs, and its performance is compared with that of the system using predicted AFs. Oracle AFs are used to establish an upper bound on the performance realizable by AFs predicted from MFCC features by DNNs. In addition to the AFs, we also explore the use of phone posteriors to further boost the performance of the MPRS. We show that oracle AFs, fused with MFCCs, offer a remarkably low target phone error rate (PER) of 10.4%, which is a 24.7% absolute reduction compared to the baseline MPRS with MFCCs alone. The best-performing system using predicted AFs shows a 2.8% absolute reduction in PER (8% relative reduction) compared to the baseline MPRS.
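The PER figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only. The abstract does not state the baseline PER or the PER of the predicted-AF system, so both are derived here from the reported reductions. The helper names are hypothetical and are not taken from the paper.

```python
# Figures reported in the abstract (all in percent PER).
ORACLE_PER = 10.4            # PER with oracle AFs fused with MFCCs
ORACLE_ABS_REDUCTION = 24.7  # absolute reduction versus the baseline
PREDICTED_ABS_REDUCTION = 2.8  # absolute reduction with predicted AFs


def implied_baseline(per_after: float, abs_reduction: float) -> float:
    """Baseline PER implied by a reported PER and its absolute reduction."""
    return round(per_after + abs_reduction, 1)


def relative_reduction(baseline: float, abs_reduction: float) -> float:
    """Relative PER reduction, in percent of the baseline PER."""
    return 100.0 * abs_reduction / baseline


# Derived values (assumptions: not stated explicitly in the abstract).
baseline_per = implied_baseline(ORACLE_PER, ORACLE_ABS_REDUCTION)
predicted_per = round(baseline_per - PREDICTED_ABS_REDUCTION, 1)
predicted_rel = relative_reduction(baseline_per, PREDICTED_ABS_REDUCTION)
oracle_rel = relative_reduction(baseline_per, ORACLE_ABS_REDUCTION)

print(f"implied baseline PER:        {baseline_per:.1f}%")
print(f"implied predicted-AF PER:    {predicted_per:.1f}%")
print(f"predicted-AF relative gain:  {predicted_rel:.1f}%")
print(f"oracle-AF relative gain:     {oracle_rel:.1f}%")
```

Under these assumptions the implied baseline is 35.1% PER, and a 2.8% absolute reduction corresponds to roughly 8% relative, which agrees with the relative figure reported in the abstract.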


 DOI: 10.21437/Interspeech.2018-2529

Cite as: K E, M., Rao, K.S., Jayagopi, D.B., Ramasubramanian, V. (2018) Indian Languages ASR: A Multilingual Phone Recognition Framework with IPA Based Common Phone-set, Predicted Articulatory Features and Feature fusion. Proc. Interspeech 2018, 1016-1020, DOI: 10.21437/Interspeech.2018-2529.


@inproceedings{KE2018,
  author={Manjunath {K E} and K. Sreenivasa Rao and Dinesh Babu Jayagopi and V Ramasubramanian},
  title={Indian Languages ASR: A Multilingual Phone Recognition Framework with IPA Based Common Phone-set, Predicted Articulatory Features and Feature fusion},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1016--1020},
  doi={10.21437/Interspeech.2018-2529},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2529}
}