The LeVoice Far-Field Speech Recognition System for VOiCES from a Distance Challenge 2019

Yulong Liang, Lin Yang, Xuyang Wang, Yingjie Li, Chen Jia, Junjie Wang


This paper describes our submission to the “VOiCES from a Distance Challenge 2019”, which is designed to foster research in speaker recognition and automatic speech recognition (ASR), with a special focus on single-channel distant/far-field audio under noisy conditions. We focused on the ASR task under a fixed condition, in which the training data was clean and small while the development and test data were noisy and mismatched. To address this mismatch, our system combined several techniques: data augmentation, weighted-prediction-error (WPE) based speech enhancement, acoustic models based on different networks, TDNN- or LSTM-based language model rescoring, and ROVER. Experiments on the development and evaluation sets showed that front-end processing, data augmentation, and system fusion contributed most to the performance improvement; the final word error rates of our system were 15.91% and 19.6%, respectively.
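For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; this is not the challenge's official scoring tool, which may apply additional text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over six words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER = {wer:.2%}")
```

Under this definition the reported figures are percentages of reference words in error, and values above 100% are possible when a hypothesis contains many insertions.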


DOI: 10.21437/Interspeech.2019-1944

Cite as: Liang, Y., Yang, L., Wang, X., Li, Y., Jia, C., Wang, J. (2019) The LeVoice Far-Field Speech Recognition System for VOiCES from a Distance Challenge 2019. Proc. Interspeech 2019, 2483-2487, DOI: 10.21437/Interspeech.2019-1944.


@inproceedings{Liang2019,
  author={Yulong Liang and Lin Yang and Xuyang Wang and Yingjie Li and Chen Jia and Junjie Wang},
  title={{The LeVoice Far-Field Speech Recognition System for VOiCES from a Distance Challenge 2019}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2483--2487},
  doi={10.21437/Interspeech.2019-1944},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1944}
}