Model compensation is a standard way of improving the robustness of speech recognisers to noise. Most model compensation techniques produce diagonal covariance matrices, which cannot represent the changes in feature correlations that noise causes. This paper presents a scheme that allows full covariance matrices to be estimated. One problem is that full covariance matrix estimation is more sensitive to approximations, such as those used for dynamic parameters, which are known to be crude. In this paper, a linear transformation of a window of consecutive frames is used as the basis for dynamic parameter compensation. A second problem is that the resulting full covariance matrices slow down decoding. This is addressed by using predictive linear transforms that decorrelate the feature space, so that the decoder can use diagonal covariance matrices. On a noise-corrupted Resource Management task, the proposed scheme outperformed the standard VTS compensation scheme.
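The idea of treating dynamic parameters as a linear transformation of a window of consecutive frames can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `window_projection` and `propagate`, the window length, and the use of standard delta regression weights are all assumptions made for illustration. The sketch relies only on the general fact that if y = D x, then the mean of y is D times the mean of x and the covariance of y is D Σ Dᵀ, so a full covariance over the stacked window yields a full covariance over statics and deltas, including their cross-correlations.

```python
import numpy as np

def window_projection(dim, half_window=1):
    """Hypothetical helper: build D mapping a stacked window of static frames
    [x_{t-w}; ...; x_t; ...; x_{t+w}] to [static; delta] for the centre frame."""
    n = 2 * half_window + 1
    # Static part: select the centre frame of the window.
    select = np.zeros((1, n))
    select[0, half_window] = 1.0
    # Delta part: standard regression weights tau / sum(tau^2) (an assumption here).
    taus = np.arange(-half_window, half_window + 1)
    delta = (taus / np.sum(taus ** 2)).reshape(1, n)
    rows = np.vstack([select, delta])          # shape (2, n)
    # Apply the same weights to every feature dimension.
    return np.kron(rows, np.eye(dim))          # shape (2*dim, n*dim)

def propagate(mean_window, cov_window, D):
    """Push a Gaussian over the stacked window through the linear map D."""
    return D @ mean_window, D @ cov_window @ D.T
```

In this sketch, `mean_window` and `cov_window` stand for the statistics of the stacked static frames after whatever compensation has been applied to them; the off-diagonal blocks of the result are the static-delta correlations that a diagonal approximation would discard.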
Bibliographic reference. Dalen, R. C. van / Gales, M. J. F. (2008): "Covariance modelling for noise-robust speech recognition", In INTERSPEECH-2008, 2000-2003.