12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

An Efferent-Inspired Auditory Model Front-End for Speech Recognition

Chia-ying Lee (1), James Glass (1), Oded Ghitza (2)

(1) MIT, USA
(2) Boston University, USA

In this paper, we investigate a closed-loop auditory model and explore its potential as a feature representation for speech recognition. The closed-loop representation consists of an auditory-based, efferent-inspired feedback mechanism that regulates the operating point of a filter bank, enabling it to adapt dynamically to changing background noise. With this dynamic adaptation, the closed-loop representation demonstrates an ability to compensate for the effects of noise on speech and generates a consistent feature representation for speech contaminated by different kinds of noise. Our preliminary experimental results on a connected digit recognition task indicate that the efferent-inspired feedback mechanism enables the closed-loop auditory model to consistently improve word recognition accuracy over an open-loop representation under mismatched training and test noise conditions.

