We propose a new approach to multichannel robust speech recognition that extends vector Taylor series (VTS)-based feature compensation from the single-channel to the multichannel case. Specifically, we use a first-order VTS to approximate the feature vector observed at each microphone. These features are then processed jointly to estimate the acoustic channel and noise statistics via expectation maximization (EM). Experimental results with TI-Digits and measured impulse responses show that the proposed method achieves significant gains in word recognition accuracy under different noise conditions.
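To illustrate the first-order VTS step, here is a minimal sketch. It assumes the widely used single-channel log-spectral mismatch model y = x + h + log(1 + exp(n - x - h)), applied independently at each microphone. That model, the function names `mismatch` and `vts_first_order`, and the toy dimensions are assumptions for illustration only; the abstract does not specify the feature domain or the exact formulation, and the joint EM estimation of channel and noise statistics is not shown.

```python
import numpy as np

def mismatch(x, h, n):
    """Assumed log-spectral mismatch: noisy feature from clean speech x,
    channel h, and additive noise n (all in the log-spectral domain)."""
    return x + h + np.log1p(np.exp(n - x - h))

def vts_first_order(x, h, n, mu_x, mu_h, mu_n):
    """First-order Taylor expansion of mismatch() around (mu_x, mu_h, mu_n)."""
    y0 = mismatch(mu_x, mu_h, mu_n)
    # Partial derivatives at the expansion point:
    # dy/dx = dy/dh = g, and dy/dn = 1 - g.
    g = 1.0 / (1.0 + np.exp(mu_n - mu_x - mu_h))
    return y0 + g * (x - mu_x) + g * (h - mu_h) + (1.0 - g) * (n - mu_n)

# Toy setup (hypothetical values): 2 microphones, 4 feature dimensions.
rng = np.random.default_rng(0)
num_mics, dim = 2, 4
mu_x = np.zeros(dim)                                # clean-speech mean
mu_n = np.full((num_mics, dim), -2.0)               # per-microphone noise mean
mu_h = rng.normal(scale=0.1, size=(num_mics, dim))  # per-microphone channel mean

# Small perturbations around the expansion point.
x = mu_x + 0.01 * rng.normal(size=dim)
h = mu_h + 0.01 * rng.normal(size=(num_mics, dim))
n = mu_n + 0.01 * rng.normal(size=(num_mics, dim))

# Broadcasting applies the same clean speech x to every microphone.
exact = mismatch(x, h, n)
approx = vts_first_order(x, h, n, mu_x, mu_h, mu_n)
max_error = np.max(np.abs(exact - approx))
```

Because the expansion is linear in x, h, and n, Gaussian statistics for clean speech, channel, and noise map to Gaussian statistics for the noisy features, which is what makes EM-style estimation tractable in VTS-based compensation generally.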
Bibliographic reference. Souden, Mehrez / Kinoshita, Keisuke / Delcroix, Marc / Nakatani, Tomohiro (2011): "A multichannel feature-based processing for robust speech recognition", In INTERSPEECH-2011, 689-692.