INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Adaptive Training Using Discriminative Mapping Transforms

C. K. Raut, K. Yu, M. J. F. Gales

University of Cambridge, UK

Speaker adaptive training (SAT) is a useful technique for building speech recognition systems on non-homogeneous data. When combining SAT with discriminative training criteria, maximum likelihood (ML) transforms are often used for unsupervised adaptation tasks. This is because discriminatively estimated transforms are highly sensitive to errors in the supervision hypothesis. In this paper, speaker adaptive training based on discriminative mapping transforms (DMTs) is proposed. DMTs are speaker-independent discriminative transforms that are applied to ML-estimated speaker-specific transforms. As DMTs are estimated during training, they are not affected by errors in the supervision hypothesis. The proposed method was evaluated on an English conversational telephone speech task. It was found to significantly outperform the standard discriminative SAT schemes.
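
As a rough illustration of applying a speaker-independent transform on top of a speaker-specific one, the sketch below assumes both are affine transforms of a Gaussian mean vector, which the abstract does not state. All function names, variable names, and numeric values are hypothetical. It shows only how the two transforms compose. It does not show how either one is estimated.

```python
# Hypothetical illustration: names and toy values are assumptions, not from the paper.

def apply_affine(A, b, x):
    """Return A x + b for a square matrix A (list of rows) and vectors b, x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]

def compose_affine(A_outer, b_outer, A_inner, b_inner):
    """Return (A, b) with A x + b == A_outer (A_inner x + b_inner) + b_outer."""
    n = len(A_inner)
    A = [[sum(A_outer[i][k] * A_inner[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    b = apply_affine(A_outer, b_outer, b_inner)
    return A, b

# Speaker-specific transform, assumed ML-estimated from the supervision hypothesis.
A_ml = [[1.0, 0.5], [0.0, 2.0]]
b_ml = [1.0, -1.0]

# Speaker-independent mapping transform, assumed estimated discriminatively in training.
A_dmt = [[2.0, 0.0], [1.0, 1.0]]
b_dmt = [0.5, 0.0]

mean = [1.0, 2.0]  # toy Gaussian mean

# Apply the ML transform first, then the mapping transform on top of it.
two_step = apply_affine(A_dmt, b_dmt, apply_affine(A_ml, b_ml, mean))

# Equivalent single transform obtained by composing the two.
A_c, b_c = compose_affine(A_dmt, b_dmt, A_ml, b_ml)
one_step = apply_affine(A_c, b_c, mean)
```

Under these assumptions, only `A_ml` and `b_ml` depend on the test speaker and the possibly erroneous hypothesis, while `A_dmt` and `b_dmt` are fixed after training.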

Bibliographic reference. Raut, C. K. / Yu, K. / Gales, M. J. F. (2008): "Adaptive training using discriminative mapping transforms", in INTERSPEECH-2008, 1697-1700.