We demonstrate a system that integrates adaptive beam-forming and auditory features to improve speech recognition accuracy in noisy environments. Adaptive beam-forming with a microphone array uses spatial information to raise the signal-to-noise ratio (SNR) of the recording for a targeted speaker, which makes speech recognition more robust. Auditory features, which model the signal processing functions of the hearing system, have been shown to substantially improve speech recognition accuracy under noisy conditions. In our experiments, integrating adaptive beam-forming with the auditory features yields an absolute gain of more than 50% in speech recognition accuracy over a baseline when 5 dB white noise is added.
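To illustrate why combining microphone channels can raise SNR, the following minimal sketch adds white noise to a clean signal and then averages several noisy channels. It is not the adaptive beam-former described here: it swaps in a plain zero-delay delay-and-sum average, and it assumes that "5 dB white noise" means noise scaled to a 5 dB SNR. The test tone, sampling rate, microphone count, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def power(x):
    """Mean squared value of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def white_noise_at_snr(signal, snr_db, seed):
    """White Gaussian noise scaled so signal power / noise power equals snr_db."""
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in signal]
    # SNR_dB = 10*log10(P_signal / P_noise)  =>  P_noise = P_signal / 10**(SNR_dB/10)
    target_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_power / power(raw))
    return [n * scale for n in raw]


def snr_db(signal, noise):
    """SNR in dB given separate signal and noise sequences."""
    return 10 * math.log10(power(signal) / power(noise))


# Assumption: 1 s of a 440 Hz tone at 8 kHz stands in for speech.
fs = 8000
clean = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]

# Assumption: four microphones, each with independent white noise at 5 dB SNR.
num_mics = 4
noises = [white_noise_at_snr(clean, 5.0, seed=m) for m in range(num_mics)]

# Zero-delay delay-and-sum: averaging the channels leaves the (identical)
# clean signal unchanged, so the residual noise is the mean of the noises.
beam_noise = [sum(col) / num_mics for col in zip(*noises)]

single_snr = snr_db(clean, noises[0])
beam_snr = snr_db(clean, beam_noise)
print(f"single channel: {single_snr:.2f} dB, averaged: {beam_snr:.2f} dB")
```

With independent noise on each channel, averaging M channels reduces noise power by roughly a factor of M, so the expected gain is about 10·log10(M) dB. An adaptive beam-former would additionally estimate steering delays and suppress directional interference, which this sketch does not attempt.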
Index Terms: adaptive beam-forming, auditory features, robust speech recognition, SNR
Bibliographic reference. Sun, Xie / Li, Qi Peter / Zhu, Manli / Zhou, Qiru (2012): "Integrating adaptive beam-forming and auditory features for robust large vocabulary speech recognition", In INTERSPEECH-2012, 2115-2116.