ISCA Archive CHiME 2018

The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays

Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu, Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, Gregory Sell

This paper presents Hitachi and JHU’s efforts to develop a CHiME-5 system for recognizing dinner-party speech recorded by multiple microphone arrays. We developed (1) a way to apply multiple data augmentation methods, (2) residual bidirectional long short-term memory, (3) 4-ch acoustic models, (4) multiple-array combination methods, (5) a hypothesis deduplication method, and (6) a speaker adaptation technique for a neural beamformer. As a result, our best system in category B achieved a word error rate (WER) of 52.38% on the development set, which corresponds to a 35% relative WER reduction from the state-of-the-art baseline. Our best system also achieved a WER of 48.20% on the evaluation set, which was the 2nd best result in the CHiME-5 competition.
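The relative WER reduction quoted above follows the usual definition, (baseline WER - system WER) / baseline WER. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the baseline value it computes is back-derived from the rounded 35% figure in the abstract, so it is an approximation and not a number reported on this page.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


def implied_baseline_wer(system_wer: float, relative_reduction: float) -> float:
    """Invert the definition: the baseline WER implied by a system WER
    and a relative reduction (given as a fraction in [0, 1))."""
    if not 0 <= relative_reduction < 1:
        raise ValueError("relative_reduction must be in [0, 1)")
    return system_wer / (1.0 - relative_reduction)


# Figures from the abstract: 52.38% development-set WER, about 35% relative reduction.
# The result is an approximation because the 35% figure is rounded.
approx_baseline = implied_baseline_wer(52.38, 0.35)
print(f"implied baseline WER: about {approx_baseline:.1f}%")
```

Because relative reduction is normalized by the baseline, the same absolute gain counts for less when the baseline WER is high, which is why both the absolute WER and the relative figure are reported.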


doi: 10.21437/CHiME.2018-2

Cite as: Kanda, N., Ikeshita, R., Horiguchi, S., Fujita, Y., Nagamatsu, K., Wang, X., Manohar, V., Yalta Soplin, N.E., Maciejewski, M., Chen, S.-J., Subramanian, A.S., Li, R., Wang, Z., Naradowsky, J., Garcia-Perera, L.P., Sell, G. (2018) The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays. Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018), 6-10, doi: 10.21437/CHiME.2018-2

@inproceedings{kanda18_chime,
  author={Naoyuki Kanda and Rintaro Ikeshita and Shota Horiguchi and Yusuke Fujita and Kenji Nagamatsu and Xiaofei Wang and Vimal Manohar and Nelson Enrique {Yalta Soplin} and Matthew Maciejewski and Szu-Jui Chen and Aswin Shanmugam Subramanian and Ruizhi Li and Zhiqi Wang and Jason Naradowsky and L. Paola Garcia-Perera and Gregory Sell},
  title={{The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays}},
  year=2018,
  booktitle={Proc. 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018)},
  pages={6--10},
  doi={10.21437/CHiME.2018-2}
}