ISCA Archive CHiME 2018

The NDSC transcription system for the 2018 CHiME-5 Challenge

Dan Qu, Cheng-Ran Liu, Xu-Kiu Yang, Wen-lin Zhang

The National Digital Switching System Engineering and Technological R&D Center (NDSC) speech-to-text transcription system for the 2018 CHiME-5 Challenge is described. Time delay neural network (TDNN) and TDNN-long short-term memory recurrent neural network (TDNN-LSTM) systems are trained using deep bottleneck features (BNF). Because parallel audio recordings from the worn microphones are available, a third system is trained in which the alignments for the Kinect device recordings are generated from the worn-microphone recordings. Finally, minimum Bayes risk (MBR) combination is used to combine the different systems and reduce the word error rate (WER) further. The WER of our system on the development set is 74.61%, a 6% absolute reduction compared with the baseline system.
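
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the challenge's official scoring tool, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokens are split on whitespace, with no text
    normalization (an assumption, not the challenge's scoring rules).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

An "absolute" reduction, as used in the abstract, is the plain difference between two such rates expressed in percentage points, as opposed to a reduction relative to the baseline value.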


doi: 10.21437/CHiME.2018-18

Cite as: Qu, D., Liu, C.-R., Yang, X.-K., Zhang, W.-l. (2018) The NDSC transcription system for the 2018 CHiME-5 Challenge. Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018), 81-84, doi: 10.21437/CHiME.2018-18

@inproceedings{qu18_chime,
  author={Dan Qu and Cheng-Ran Liu and Xu-Kiu Yang and Wen-lin Zhang},
  title={{The NDSC transcription system for the 2018 CHiME-5 Challenge}},
  year=2018,
  booktitle={Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018)},
  pages={81--84},
  doi={10.21437/CHiME.2018-18}
}