This paper presents an auditory-based modulation spectral feature that improves automatic speech recognition (ASR) performance in the presence of room reverberation. The approach extracts features that model auditory processing, specifically long-term modulation spectral features computed after gammatone filtering, in order to reduce sensitivity to environmental noise while preserving the speech intelligibility information that is essential for ASR. Experiments are performed on the Aurora-5 meeting recorder digit task, recorded with four different microphones in hands-free mode in a real meeting room. For comparison, recognition results are also reported for the standard ETSI basic and advanced front-ends and for conventional features with standard feature compensation. The experimental results show that the proposed features provide reliable and considerable improvements over state-of-the-art feature extraction techniques.
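As a rough illustration of the kind of processing the abstract describes, the following is a minimal sketch of a gammatone filterbank followed by a long-term modulation spectrum of each band's envelope. It is not the authors' pipeline: the function names, the fourth-order filter with Glasberg and Moore ERB bandwidths, the 10 ms frame shift, the power-based envelope, and the Hann-windowed FFT are all assumptions made for the example.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of a gammatone filter centred at fc (assumed 4th order)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def gammatone_envelopes(x, fs, centre_freqs, frame_shift=0.010):
    """Per-band power envelopes, one value per frame_shift seconds."""
    hop = int(frame_shift * fs)
    n_frames = len(x) // hop
    env = np.empty((len(centre_freqs), n_frames))
    for i, fc in enumerate(centre_freqs):
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        power = y[: n_frames * hop] ** 2
        env[i] = power.reshape(n_frames, hop).mean(axis=1)
    return env

def modulation_spectrum(env, frame_rate):
    """Magnitude spectrum of each band's envelope trajectory over the whole segment."""
    centred = env - env.mean(axis=1, keepdims=True)  # remove the DC component
    window = np.hanning(env.shape[1])
    spec = np.abs(np.fft.rfft(centred * window, axis=1))
    freqs = np.fft.rfftfreq(env.shape[1], d=1.0 / frame_rate)
    return freqs, spec

# Usage on a synthetic signal: a 1 kHz tone amplitude-modulated at 4 Hz.
fs = 8000
t = np.arange(2 * fs) / fs
x = (1 + 0.8 * np.cos(2 * np.pi * 4 * t)) * np.cos(2 * np.pi * 1000 * t)
env = gammatone_envelopes(x, fs, [500.0, 1000.0, 2000.0])
freqs, spec = modulation_spectrum(env, frame_rate=100.0)
```

Analysing the envelope over a long segment, here two seconds, is what gives the resolution needed to separate slow modulations of a few hertz. A real front-end would typically add further steps, such as compression and decorrelation, that this sketch leaves out.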
Bibliographic reference. Maganti, Hari Krishna / Matassoni, Marco (2010): "An auditory based modulation spectral feature for reverberant speech recognition", In INTERSPEECH-2010, 570-573.