Deep neural networks (DNNs) require large amounts of training data to build robust acoustic models for speech recognition tasks. Our work aims to improve low-resource language acoustic models so that they reach performance comparable to a high-resource scenario, with the help of data or model parameters from other high-resource languages. We explore transfer learning and distillation methods, in which a complex high-resource model guides or supervises the training of the low-resource model. The techniques include (i) a multilingual framework that borrows data from a high-resource language while training the low-resource acoustic model, with KL-divergence-based constraints added to bias the model towards the low-resource language, and (ii) distilling knowledge from the complex high-resource model into the low-resource acoustic model. Experiments were performed on three Indian languages, namely Hindi, Tamil and Kannada. All the techniques gave improved performance, with the multilingual framework with KL-divergence regularization giving the best results. In all three languages, performance close to or better than the high-resource scenario was obtained.
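The abstract does not give the exact training objective, but a minimal sketch of the distillation idea it describes (a high-resource teacher supervising a low-resource student at the frame/senone level) might look like the following. The function name, temperature, and alpha weighting are illustrative assumptions, not values from the paper; the KL-regularized multilingual variant is similar in spirit, with the KL term instead biasing the shared model towards the low-resource language.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_targets,
                      temperature=2.0, alpha=0.5):
    """Illustrative distillation objective (not the paper's exact loss):
    KL divergence between teacher and student senone posteriors, combined
    with the usual cross-entropy on the low-resource hard targets.
    `temperature` and `alpha` are hypothetical hyper-parameters."""
    # Soft targets from the high-resource teacher model
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence term: student is pulled towards the teacher's distribution
    kd_loss = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (temperature ** 2)
    # Standard cross-entropy against the low-resource senone labels
    ce_loss = F.cross_entropy(student_logits, hard_targets)
    return alpha * kd_loss + (1.0 - alpha) * ce_loss
```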
Cite as: Abraham, B., Seeram, T., Umesh, S. (2017) Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages. Proc. Interspeech 2017, 2158-2162, doi: 10.21437/Interspeech.2017-1009
@inproceedings{abraham17b_interspeech,
  author={Basil Abraham and Tejaswi Seeram and S. Umesh},
  title={{Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2158--2162},
  doi={10.21437/Interspeech.2017-1009}
}