ISCA Archive Interspeech 2006

Robust feature space adaptation for telephony speech recognition

Xin Lei, Jon Hamaker, Xiaodong He

Speaker adaptation is critical for modern speech recognition systems. Because of computational and multi-channel model-sharing considerations, the use of model adaptation techniques is limited in telephony speech recognition systems. Feature space adaptation methods such as feature space maximum likelihood linear regression (fMLLR), on the other hand, are efficient and well suited to telephony systems. In this work, we first describe techniques for efficient implementation of online fMLLR adaptation. We then propose feature space maximum a posteriori linear regression (fMAPLR), which incorporates prior knowledge into the feature transform estimation and improves the robustness of the conventional fMLLR approach. Experiments on telephony data indicate that fMAPLR is significantly more robust than fMLLR and outperforms it, especially when the adaptation data is very limited.

doi: 10.21437/Interspeech.2006-268

Cite as: Lei, X., Hamaker, J., He, X. (2006) Robust feature space adaptation for telephony speech recognition. Proc. Interspeech 2006, paper 1743-Tue1A2O.2, doi: 10.21437/Interspeech.2006-268

  author={Xin Lei and Jon Hamaker and Xiaodong He},
  title={{Robust feature space adaptation for telephony speech recognition}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1743-Tue1A2O.2},