Machine Listening in Multisource Environments (CHiME) 2011

Florence, Italy
September 1, 2011

Robust Automatic Speech Recognition through On-Line Semi Blind Source Extraction

Francesco Nesta, Marco Matassoni

Fondazione Bruno Kessler-Irst, Trento, Italy

This paper describes the system used to process the data of the CHiME Pascal 2011 competition, whose goal is to separate the desired speech and recognize the spoken commands. The binaurally recorded mixtures are processed by an on-line Semi-Blind Source Extraction algorithm. The algorithm is based on a multi-stage architecture that combines the advantages of constrained Independent Component Analysis and Wiener-based processing, allowing the target signal to be estimated with limited distortion. The recovered target signal is then fed to the recognizer, which uses noise-robust features based on Gammatone Frequency Cepstral Coefficients. As a further stage, the acoustic models are adapted to the processed signals to reduce the acoustic mismatch. A performance comparison between different model and algorithmic settings is reported for both the development and test data sets.

Index Terms: blind source separation, speech enhancement, robust speech recognition

