INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Discriminative MCE-Based Speaker Adaptation of Acoustic Models for a Spoken Lecture Processing Task

Timothy J. Hazen (1), Erik McDermott (2)

(1) MIT, USA
(2) NTT Corporation, Japan

This paper investigates minimum classification error (MCE) training in conjunction with speaker adaptation for the large-vocabulary speech recognition task of lecture transcription. Emphasis is placed on supervised adaptation, though the unsupervised case is also examined. The work builds upon our previous use of MCE training to construct speaker-independent acoustic models. Here we explore strategies for incorporating MCE training into a model interpolation adaptation scheme in the spirit of traditional maximum a posteriori (MAP) adaptation. Experiments show relative error rate reductions of between 3% and 7% over a baseline system that uses standard maximum likelihood (ML) estimation instead of MCE training during the adaptation phase.
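
As a rough illustration of what "model interpolation in the spirit of MAP adaptation" can mean, the sketch below shows the textbook MAP-style update for a single Gaussian mean: the adapted mean is a count-weighted interpolation between a speaker-independent mean and a mean estimated from the speaker's adaptation data. This is not the scheme from the paper, whose details the abstract does not give. The function name, the `tau` prior weight, and the use of an ML (sample average) speaker estimate are all assumptions made for illustration; the paper's contribution concerns using MCE training rather than ML estimation in the adaptation phase.

```python
def map_interpolate_mean(si_mean, adapt_frames, tau=10.0):
    """Interpolate a speaker-independent (SI) mean with a speaker-specific mean.

    si_mean      : list of floats, the SI Gaussian mean (hypothetical input)
    adapt_frames : list of feature vectors assigned to this Gaussian
    tau          : prior weight (assumed value); larger tau trusts the SI model more
    """
    n = len(adapt_frames)
    if n == 0:
        # No adaptation data for this Gaussian: fall back to the SI mean.
        return list(si_mean)

    dim = len(si_mean)
    # ML estimate of the speaker-specific mean (plain sample average).
    sd_mean = [sum(frame[d] for frame in adapt_frames) / n for d in range(dim)]

    # Interpolation weight grows toward 1 as more adaptation data is observed.
    w = n / (n + tau)
    return [w * sd + (1.0 - w) * si for sd, si in zip(sd_mean, si_mean)]


# Usage: 10 frames with tau=10 gives equal weight to both means.
adapted = map_interpolate_mean([0.0, 0.0], [[1.0, 2.0]] * 10, tau=10.0)
```

With little adaptation data the result stays close to the speaker-independent mean, and with more data it moves toward the speaker-specific estimate, which is the general behavior an interpolation-based adaptation scheme aims for.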


Bibliographic reference.  Hazen, Timothy J. / McDermott, Erik (2007): "Discriminative MCE-based speaker adaptation of acoustic models for a spoken lecture processing task", In INTERSPEECH-2007, 1577-1580.