Factorized Deep Neural Network Adaptation for Automatic Scoring of L2 Speech in English Speaking Tests

Dean Luo, Chunxiao Zhang, Linzhong Xia, Lixin Wang


Speaker adaptation has been shown to be effective for both speech recognition and the evaluation of L2 speech. However, factors other than speaker identity, such as the recording environment and foreign accent, also affect the speech signal. Factorizing the speaker, environment and other acoustic factors is crucial in evaluating L2 speech, because it reduces the acoustic mismatch between training and test conditions. In this study, we investigate the effects of factorized deep neural network (DNN) adaptation techniques on L2 speech assessment in real speaking tests. Through recognition and automatic scoring experiments on L2 speech, we demonstrate that factorized fMLLR and iVector-based DNN adaptation can make better use of adaptation data to adapt efficiently to complex speaker and environment conditions. Combining the factored components of iVectors and fMLLR transforms can further improve the robustness of DNN models in speech recognition and automatic scoring of L2 speech in dynamic environments.
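As a rough illustration of how fMLLR-style transforms and iVectors can be combined at the DNN input, the following minimal sketch cascades two affine feature transforms (one per factor) and appends an iVector to every frame. It is not the authors' implementation: the function names, the environment-then-speaker ordering, and all numeric values are assumptions chosen only to keep the example small.

```python
# Illustrative sketch only; names, ordering, and values are hypothetical.

def apply_affine(frames, A, b):
    """Apply the fMLLR-style affine map x -> A x + b to every frame."""
    return [
        [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(A, b)]
        for x in frames
    ]


def adapt_features(frames, env_tf, spk_tf, ivector):
    """Cascade an environment and a speaker transform, then append the iVector.

    frames:  list of feature vectors, one per frame
    env_tf:  (A_env, b_env), assumed environment-specific transform
    spk_tf:  (A_spk, b_spk), assumed speaker-specific transform
    ivector: fixed-length vector shared by all frames of the utterance
    """
    x = apply_affine(frames, *env_tf)   # assumed order: environment first
    x = apply_affine(x, *spk_tf)        # then speaker
    return [frame + list(ivector) for frame in x]


# Toy data: two 2-dimensional frames and a 2-dimensional iVector.
frames = [[1.0, 2.0], [3.0, 4.0]]
env_tf = ([[0.5, 0.0], [0.0, 0.5]], [1.0, 1.0])
spk_tf = ([[2.0, 0.0], [0.0, 2.0]], [-1.0, -1.0])
ivector = [0.1, 0.2]

dnn_input = adapt_features(frames, env_tf, spk_tf, ivector)
print(dnn_input)
```

With these toy values the two transforms compose to x -> x + 1, so each output frame is the shifted feature vector followed by the same iVector entries. In a real system the transforms and iVectors would be estimated from adaptation data rather than set by hand.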


DOI: 10.21437/Interspeech.2018-2138

Cite as: Luo, D., Zhang, C., Xia, L., Wang, L. (2018) Factorized Deep Neural Network Adaptation for Automatic Scoring of L2 Speech in English Speaking Tests. Proc. Interspeech 2018, 1656-1660, DOI: 10.21437/Interspeech.2018-2138.


@inproceedings{Luo2018,
  author={Dean Luo and Chunxiao Zhang and Linzhong Xia and Lixin Wang},
  title={Factorized Deep Neural Network Adaptation for Automatic Scoring of L2 Speech in English Speaking Tests},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1656--1660},
  doi={10.21437/Interspeech.2018-2138},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2138}
}