This paper presents a novel approach that combines microphone array processing with robust speech recognition for reverberant multi-speaker environments. Spatial cues are extracted from a microphone array and automatically clustered to estimate localization masks in the time-frequency domain. The localization masks are then used to blindly design adaptive filters that enhance the source signals prior to missing data speech recognition. A novel evidence model that better exploits the information provided by the source separation stage is proposed. Recognition experiments demonstrate the effectiveness of the scheme compared to traditional microphone array enhancement and a related binaural separation model.
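To make the separation stage concrete, the following is a minimal sketch (not the authors' exact algorithm) of how inter-channel phase differences from a two-microphone recording can be clustered to form a binary time-frequency localization mask. The STFT parameters, simulated delay, and the simple 1-D k-means are illustrative assumptions.

```python
import numpy as np

def stft(x, win=256, hop=128):
    # Hanning-windowed short-time Fourier transform (illustrative parameters)
    frames = [x[i:i + win] * np.hanning(win)
              for i in range(0, len(x) - win + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)        # source 1: tonal signal
s2 = 0.5 * rng.standard_normal(fs)      # source 2: noise-like signal

# Simulate two microphones: source 1 reaches mic 2 with a small delay
# (hypothetical array geometry)
delay = 4  # samples
m1 = s1 + s2
m2 = np.concatenate([np.zeros(delay), s1[:-delay]]) + s2

X1, X2 = stft(m1), stft(m2)
# Spatial cue: inter-channel phase difference per time-frequency bin
ipd = np.angle(X1 * np.conj(X2))

# Crude 1-D k-means with two clusters on the phase cue
feat = ipd.ravel()
centers = np.array([-1.0, 1.0])
for _ in range(20):
    labels = np.argmin(np.abs(feat[:, None] - centers[None, :]), axis=1)
    for k in range(2):
        if np.any(labels == k):
            centers[k] = feat[labels == k].mean()

# Binary localization mask: one cluster per assumed source direction
mask = labels.reshape(ipd.shape)
```

In the paper's pipeline such masks would then steer an adaptive beamformer and supply reliability information for the missing data decoder; here the mask is simply left as the end product of the clustering step.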
Bibliographic reference: Kühne, Marco / Togneri, Roberto / Nordholm, Sven (2008): "Adaptive beamforming and soft missing data decoding for robust speech recognition in reverberant environments", in INTERSPEECH-2008, 976–979.