Adaptive Multichannel Dereverberation for Automatic Speech Recognition

Joe Caroselli, Izhak Shafran, Arun Narayanan, Richard Rose


Reverberation is known to degrade the performance of automatic speech recognition (ASR) systems dramatically in far-field conditions. Adopting the weighted prediction error (WPE) approach, we formulate an online dereverberation algorithm for a multi-microphone array. The key contributions of this paper are: (a) we demonstrate that dereverberation using WPE improves performance even when the acoustic models are trained using multi-style training (MTR) with noisy, reverberated speech; (b) we show that the gains from WPE are preserved even on large and diverse real-world data sets; (c) we propose an adaptive version for online multichannel ASR tasks that gives gains similar to those of the non-causal version; and (d) although the algorithm can be applied at evaluation time only, we show that also including dereverberation during training yields larger performance gains. We also report how different parameter settings of the dereverberation algorithm impact ASR performance.
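The abstract does not give the update equations. As a rough illustration of what an adaptive, causal WPE-style filter can look like, the sketch below applies a generic recursive least squares (RLS) update to a single STFT frequency bin of a multichannel signal. The function name `online_wpe_bin`, its parameters and defaults, and the choice to estimate signal power from past frames only are assumptions made for illustration; this is not claimed to be the formulation used in the paper.

```python
def online_wpe_bin(frames, taps=2, delay=2, forget=0.999, power_ctx=3, eps=1e-6):
    """Hypothetical RLS-style adaptive WPE sketch for one frequency bin.

    frames: list over time; each item is a list of complex STFT values,
            one per microphone, for the same frequency bin.
    Returns the dereverberated estimate for microphone 0 at every frame.
    """
    num_ch = len(frames[0])
    n = num_ch * taps
    g = [0j] * n  # linear prediction filter (starts at zero)
    # Inverse of the power-weighted covariance matrix (identity as an assumed init).
    P = [[(1.0 if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    out = []
    for t, frame in enumerate(frames):
        # Stack delayed observations from all microphones: t-delay, t-delay-1, ...
        u = []
        for k in range(taps):
            idx = t - delay - k
            u.extend(frames[idx] if idx >= 0 else [0j] * num_ch)

        # Subtract the predicted late reverberation (a priori, using the old filter).
        pred = sum(gi.conjugate() * ui for gi, ui in zip(g, u))
        d = frame[0] - pred
        out.append(d)

        # Power estimate averaged over microphones and a few past frames.
        past = frames[max(0, t - power_ctx):t]
        if past:
            lam = sum(abs(v) ** 2 for f in past for v in f) / (len(past) * num_ch)
        else:
            lam = eps
        lam = max(lam, eps)

        # RLS gain vector: k = P u / (forget * lam + u^H P u).
        Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
        denom = forget * lam + sum(ui.conjugate() * pi for ui, pi in zip(u, Pu)).real
        gain = [pi / denom for pi in Pu]

        # Update the inverse covariance and the filter.
        uhP = [sum(u[i].conjugate() * P[i][j] for i in range(n)) for j in range(n)]
        P = [[(P[i][j] - gain[i] * uhP[j]) / forget for j in range(n)] for i in range(n)]
        g = [gi + ki * d.conjugate() for gi, ki in zip(g, gain)]
    return out
```

In a full front end, a routine of this kind would run independently for every frequency bin of the STFT, and the per-bin outputs would then be passed on to feature extraction. The `delay` parameter keeps the direct path and early reflections out of the prediction so that only late reverberation is subtracted.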


DOI: 10.21437/Interspeech.2017-1791

Cite as: Caroselli, J., Shafran, I., Narayanan, A., Rose, R. (2017) Adaptive Multichannel Dereverberation for Automatic Speech Recognition. Proc. Interspeech 2017, 3877-3881, DOI: 10.21437/Interspeech.2017-1791.


@inproceedings{Caroselli2017,
  author={Joe Caroselli and Izhak Shafran and Arun Narayanan and Richard Rose},
  title={Adaptive Multichannel Dereverberation for Automatic Speech Recognition},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3877--3881},
  doi={10.21437/Interspeech.2017-1791},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1791}
}