ISCA Archive Interspeech 2009

Transforming features to compensate speech recogniser models for noise

R. C. van Dalen, F. Flego, M. J. F. Gales

To make speech recognisers robust to noise, either the features or the models can be compensated. Feature enhancement is often fast; model compensation is often more accurate, because it predicts the corrupted speech distribution. It is therefore able, for example, to take uncertainty about the clean speech into account. This paper re-analyses the recently proposed predictive linear transformations for noise compensation as minimising the KL divergence between the predicted corrupted speech and the adapted models. New schemes are then introduced which apply observation-dependent transformations in the front-end to adapt the back-end distributions. One applies transforms in exactly the same manner as the popular minimum mean square error (MMSE) feature enhancement scheme, and is as fast. The new method performs better on Aurora 2.
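
As a rough illustration of what an observation-dependent front-end transformation can look like, the sketch below computes a posterior-weighted sum of per-component affine transforms of an observation, which is the general shape of MMSE-style feature enhancement with a front-end Gaussian mixture model. It is not the paper's scheme: the function names, the diagonal covariances, and the diagonal (per-dimension) scale and offset parameters are all assumptions made for illustration, and how the transform parameters are estimated is not shown.

```python
import math


def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at y."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )


def posteriors(y, weights, means, variances):
    """Component posteriors P(k | y) under a (hypothetical) front-end GMM."""
    log_joint = [
        math.log(w) + log_gauss_diag(y, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lj - top) for lj in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def transform_observation(y, weights, means, variances, scales, offsets):
    """Posterior-weighted sum of per-component affine transforms of y.

    scales[k] and offsets[k] are assumed diagonal transforms: each
    dimension d of component k maps y[d] to scales[k][d] * y[d] + offsets[k][d].
    """
    gamma = posteriors(y, weights, means, variances)
    out = [0.0] * len(y)
    for g, a, b in zip(gamma, scales, offsets):
        for d in range(len(y)):
            out[d] += g * (a[d] * y[d] + b[d])
    return out
```

Because the component posteriors depend on the observation, the effective transform changes from frame to frame even though each component's parameters are fixed, and the cost per frame is one pass over the front-end components.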

doi: 10.21437/Interspeech.2009-373

Cite as: Dalen, R.C.v., Flego, F., Gales, M.J.F. (2009) Transforming features to compensate speech recogniser models for noise. Proc. Interspeech 2009, 2499-2502, doi: 10.21437/Interspeech.2009-373

  author={R. C. van Dalen and F. Flego and M. J. F. Gales},
  title={{Transforming features to compensate speech recogniser models for noise}},
  booktitle={Proc. Interspeech 2009},