9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Beyond Linear Transforms: Efficient Non-Linear Dynamic Adaptation for Noise Robust Speech Recognition

Steven J. Rennie, Pierre L. Dognin

IBM T.J. Watson Research Center, USA

In this paper, we present new theory and results that combine constrained Maximum Likelihood Linear Regression (MLLR), also known as feature-space MLLR (fMLLR), a state-of-the-art model adaptation technique, with Dynamic Noise Adaptation (DNA), a state-of-the-art noise adaptation algorithm. We explain how DNA implements a highly non-linear transform on speech model features, and why DNA is better suited than fMLLR to compensating for additive noise. Test results are presented as a function of SNR on the DNA + Aurora II framework, which is based upon a collection of challenging in-car noise recordings. The results demonstrate that DNA significantly outperforms block fMLLR on additive noise, and that DNA + fMLLR outperforms the ETSI advanced front-end (AFE) system + fMLLR by a significant margin (over 7% absolute).
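
To illustrate why an affine transform struggles with additive noise, the following minimal sketch assumes the common log-spectral approximation y = log(exp(x) + exp(n)), in which speech and noise powers add and the phase term is ignored. It is not the authors' implementation: the function names, the scalar least-squares fit standing in for an fMLLR-style affine map, and the grid of clean log-powers are all illustrative assumptions.

```python
import math


def noisy_log_power(x, n):
    """Log-power of speech plus additive noise: y = log(exp(x) + exp(n)).

    Assumption: speech and noise powers add and phase is ignored.
    Computed in a numerically stable way by factoring out the larger term.
    """
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def fit_affine(xs, ys):
    """Least-squares fit y ~ a*x + b.

    Hypothetical scalar stand-in for an affine feature transform.
    """
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def max_affine_residual(noise_level, xs):
    """Largest absolute error left by the best affine map from clean to noisy."""
    ys = [noisy_log_power(x, noise_level) for x in xs]
    a, b = fit_affine(xs, ys)
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))


# Illustrative grid of clean log-powers from -10 to 10 (assumed values).
clean = [i * 0.5 for i in range(-20, 21)]

# Noise at a level comparable to the speech: the mapping bends sharply.
residual = max_affine_residual(0.0, clean)

# Noise far below the speech: the mapping is almost the identity.
residual_quiet = max_affine_residual(-50.0, clean)

print(residual, residual_quiet)
```

Under these assumptions the affine fit is nearly exact when the noise sits far below the speech, but leaves a large residual when noise and speech levels overlap. That overlap is the regime in which a non-linear compensation scheme has room to improve on a linear one.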

Bibliographic reference. Rennie, Steven J. / Dognin, Pierre L. (2008): "Beyond linear transforms: efficient non-linear dynamic adaptation for noise robust speech recognition", in INTERSPEECH-2008, 1305-1308.