ISCA Archive Interspeech 2008

Fast speaker adaptive training for speech recognition

Daniel Povey, Hong-Kwang Jeff Kuo, Hagen Soltau

In this paper we describe various fast and convenient implementations of Speaker Adaptive Training (SAT) for use when Maximum Likelihood Linear Regression (MLLR) will be applied at test time to adapt Gaussian means. The memory and disk requirements for most of these are similar to those for normal ML training; the computation in all cases is dominated by the need to compute the MLLR transforms. MLLR is commonly combined with Constrained MLLR (CMLLR), which can be viewed as a feature-space affine transform and has its own form of SAT (which we call CMLLR-SAT); we experiment with combining the two forms of SAT. We find that MLLR-SAT gives improvements even on top of CMLLR-SAT.
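
As a rough illustration of the distinction drawn above, the following minimal sketch contrasts MLLR, which applies an affine transform to the Gaussian means, with CMLLR viewed as an affine transform of the feature vectors. It is not the paper's implementation: the function name `affine`, the 2-dimensional values, and the reuse of a single matrix `A` and offset `b` for both cases are assumptions made only to keep the example short. In practice the transforms are estimated per speaker, and the MLLR and CMLLR transforms generally differ.

```python
def affine(A, b, v):
    """Return A*v + b for a square matrix A (list of rows) and vectors b, v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(A, b)]

# Hypothetical 2-dimensional values, chosen only for illustration.
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [1.0, -1.0]

gaussian_means = [[0.0, 0.0], [2.0, 1.0]]  # speaker-independent model means
features = [[1.0, 1.0], [4.0, 0.0]]        # observed feature vectors

# MLLR: adapt every Gaussian mean and leave the features untouched.
mllr_means = [affine(A, b, mu) for mu in gaussian_means]

# CMLLR viewed as a feature-space transform: adapt every feature vector
# and leave the model parameters untouched.
cmllr_features = [affine(A, b, x) for x in features]

print(mllr_means)
print(cmllr_features)
```

The two operations have the same algebraic form; what differs is whether the model parameters or the observations are modified.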

doi: 10.21437/Interspeech.2008-377

Cite as: Povey, D., Kuo, H.-K.J., Soltau, H. (2008) Fast speaker adaptive training for speech recognition. Proc. Interspeech 2008, 1245-1248, doi: 10.21437/Interspeech.2008-377

  author={Daniel Povey and Hong-Kwang Jeff Kuo and Hagen Soltau},
  title={{Fast speaker adaptive training for speech recognition}},
  booktitle={Proc. Interspeech 2008},