ISCA Archive CHiME 2018

The ZTSpeech system for CHiME-5 Challenge: A far-field speech recognition system with front-end and robust back-end

Chenxing Li, Tieqiang Wang

In this paper, we describe our ZTSpeech system for the two tracks of the CHiME-5 challenge. For the front-end, we compare several popular beamforming methods. We also propose an omnidirectional minimum variance distortionless response (OMVDR) beamformer followed by weighted prediction error (WPE). Furthermore, we investigate the impact of data augmentation and data combinations. For the back-end, we investigate in depth several acoustic models (AMs) with different architectures, and we evaluate both N-gram-based and recurrent neural network (RNN)-based language models (LMs). For the single-array track, by combining the most effective approaches, our final system achieves an absolute improvement of 11.94% on the evaluation set, from 73.27% to 61.33%. For the multiple-array track, our final system achieves an absolute improvement of 12.29% on the evaluation set, from 73.30% to 61.01%.
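The improvements quoted in the abstract are absolute differences between the two reported percentages. As a minimal sketch, the snippet below recomputes them and also derives the corresponding relative reduction. It assumes the percentages are scores for which lower is better (the abstract does not name the metric), and the function name `absolute_and_relative_gain` is hypothetical, introduced only for illustration.

```python
from typing import Tuple


def absolute_and_relative_gain(before: float, after: float) -> Tuple[float, float]:
    """Return (absolute, relative) reduction in percentage points and percent.

    Assumes a lower score is better, so the gain is before - after.
    """
    absolute = before - after              # difference in percentage points
    relative = 100.0 * absolute / before   # reduction relative to the starting score
    return round(absolute, 2), round(relative, 2)


# Figures taken from the abstract (evaluation set).
single_array = absolute_and_relative_gain(73.27, 61.33)
multiple_array = absolute_and_relative_gain(73.30, 61.01)

print("single-array   (absolute, relative):", single_array)
print("multiple-array (absolute, relative):", multiple_array)
```

The first element of each tuple reproduces the absolute figure stated in the abstract. The second is the relative reduction, which the abstract does not report and is shown here only for comparison.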


doi: 10.21437/CHiME.2018-13

Cite as: Li, C., Wang, T. (2018) The ZTSpeech system for CHiME-5 Challenge: A far-field speech recognition system with front-end and robust back-end. Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018), 58-63, doi: 10.21437/CHiME.2018-13

@inproceedings{li18_chime,
  author={Chenxing Li and Tieqiang Wang},
  title={{The ZTSpeech system for CHiME-5 Challenge: A far-field speech recognition system with front-end and robust back-end}},
  year=2018,
  booktitle={Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018)},
  pages={58--63},
  doi={10.21437/CHiME.2018-13}
}