Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection

Richeng Duan, Tatsuya Kawahara, Masatake Dantsuji, Hiroaki Nanjo


The scarcity of large-scale non-native corpora and of human annotations are two fundamental challenges in the development of computer-assisted pronunciation training (CAPT) systems. We have explored several transfer learning based methods to detect pronunciation errors without using non-native training data, and their effects were confirmed in Mandarin Chinese pronunciation error detection for Japanese speakers. In this paper, we investigate the generality of the methods by applying them to English speech data of Japanese speakers. We also evaluate them in a non-native phone recognition experiment, a task that is necessary but challenging in advanced CAPT systems. Experimental results show that transfer learning based acoustic modeling methods can not only be ported to a new target language but are also effective in a recognition task.


DOI: 10.21437/SLaTE.2017-8

Cite as: Duan, R., Kawahara, T., Dantsuji, M., Nanjo, H. (2017) Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection. Proc. 7th ISCA Workshop on Speech and Language Technology in Education, 42-46, DOI: 10.21437/SLaTE.2017-8.


@inproceedings{Duan2017,
  author={Richeng Duan and Tatsuya Kawahara and Masatake Dantsuji and Hiroaki Nanjo},
  title={Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection},
  year=2017,
  booktitle={Proc. 7th ISCA Workshop on Speech and Language Technology in Education},
  pages={42--46},
  doi={10.21437/SLaTE.2017-8},
  url={http://dx.doi.org/10.21437/SLaTE.2017-8}
}