12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition

Katariina Mahkonen (1), Antti Hurmalainen (1), Tuomas Virtanen (1), Jort F. Gemmeke (2)

(1) Tampere University of Technology, Finland
(2) Radboud Universiteit Nijmegen, The Netherlands

This paper proposes learning-based methods for mapping a sparse representation of noisy speech to state likelihoods in an automatic speech recognition system. We represent speech as a sparse linear combination of exemplars extracted from training data. The exemplar weights are mapped to speech state likelihoods using Ordinary Least Squares (OLS) and Partial Least Squares (PLS) regression. Recognition experiments are conducted on the CHiME noisy speech database. The results show that both algorithms can be used successfully to train the mapping. The learned mappings improve on the previous binary labeling system and reach recognition scores close to 70% at -6 dB SNR.
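
The idea of learning a linear map from exemplar weights to state likelihoods can be illustrated with a small sketch. It is not the authors' implementation: the data are synthetic, and the dimensions, the one-hot targets, the clipping step, and all variable names are assumptions made for illustration only. It fits an OLS mapping with NumPy; a PLS variant (for example via scikit-learn's PLSRegression) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this toy example.
n_frames, n_exemplars, n_states = 200, 20, 5

# Assumption: every exemplar and every training frame has one known state label.
exemplar_state = np.arange(n_exemplars) % n_states
frame_state = np.arange(n_frames) % n_states

# Synthetic sparse, non-negative exemplar weights: each frame activates three
# exemplars of its own state, plus one weak spurious activation.
activations = np.zeros((n_frames, n_exemplars))
for t in range(n_frames):
    same_state = np.flatnonzero(exemplar_state == frame_state[t])
    active = rng.choice(same_state, size=3, replace=False)
    activations[t, active] = rng.uniform(0.5, 1.5, size=3)
    activations[t, rng.integers(n_exemplars)] += 0.1

# Regression targets: one-hot state indicators per frame (an assumption here).
targets = np.eye(n_states)[frame_state]

# OLS: find the mapping B that minimises ||activations @ B - targets||^2.
mapping, *_ = np.linalg.lstsq(activations, targets, rcond=None)

# Apply the mapping, clip negative outputs, and normalise each frame's scores
# so they can be read as likelihood-like values over states.
raw = activations @ mapping
scores = np.clip(raw, 1e-8, None)
scores /= scores.sum(axis=1, keepdims=True)

# Frame-level agreement with the labels, measured on the training frames only.
accuracy = np.mean(scores.argmax(axis=1) == frame_state)
print(f"mapping shape: {mapping.shape}, training-frame accuracy: {accuracy:.2f}")
```

By contrast, a binary labeling scheme of the kind the abstract compares against would fix each exemplar's contribution to its labeled states in advance, whereas a regression-based mapping is estimated from data. In a real recogniser the normalised scores would be passed to the decoder as state likelihoods; that step is outside the scope of this sketch.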


Bibliographic reference. Mahkonen, Katariina / Hurmalainen, Antti / Virtanen, Tuomas / Gemmeke, Jort F. (2011): "Mapping sparse representation to state likelihoods in noise-robust automatic speech recognition", In INTERSPEECH-2011, 465-468.