Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition

Pengcheng Guo, Haihua Xu, Lei Xie, Eng Siong Chng


In this paper, we present our overall efforts to improve the performance of a code-switching speech recognition system using semi-supervised training methods from lexicon learning to acoustic modeling, on the South East Asian Mandarin-English (SEAME) data. We first investigate semi-supervised lexicon learning approach to adapt the canonical lexicon, which is meant to alleviate the heavily accented pronunciation issue within the code-switching conversation of the local area. As a result, the learned lexicon yields improved performance. Furthermore, we attempt to use semi-supervised training to deal with those transcriptions that are highly mismatched between human transcribers and ASR system. Specifically, we conduct semi-supervised training assuming those poorly transcribed data as unsupervised data. We found the semi-supervised acoustic modeling can lead to improved results. Finally, to make up for the limitation of the conventional n-gram language models due to data sparsity issue, we perform lattice rescoring using neural network language models and significant WER reduction is obtained.


 DOI: 10.21437/Interspeech.2018-1974

Cite as: Guo, P., Xu, H., Xie, L., Chng, E.S. (2018) Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition. Proc. Interspeech 2018, 1928-1932, DOI: 10.21437/Interspeech.2018-1974.


@inproceedings{Guo2018,
  author={Pengcheng Guo and Haihua Xu and Lei Xie and Eng Siong Chng},
  title={Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1928--1932},
  doi={10.21437/Interspeech.2018-1974},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1974}
}