Reverberation leads to high word error rates (WERs) in automatic speech recognition (ASR) systems. This work presents robust acoustic features, motivated by subspace modeling and human speech perception, for use in large vocabulary continuous speech recognition (LVCSR). We explore different acoustic modeling strategies and language modeling techniques, and demonstrate that robust features combined with deep-learning-based acoustic modeling can significantly reduce WERs on reverberated speech compared with mel-cepstral features and acoustic modeling based on Gaussian mixture models (GMMs).
Bibliographic reference. Mitra, Vikramjit / Hout, Julien Van / McLaren, Mitchell / Wang, Wen / Graciarena, Martin / Vergyri, Dimitra / Franco, Horacio (2015): "Combating reverberation in large vocabulary continuous speech recognition", In INTERSPEECH-2015, 2449-2453.