8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Acoustic Model Adaptation based on Coarse/Fine training of Transfer Vectors and Its Application to a Speaker Adaptation Task

Shinji Watanabe, Atsushi Nakamura

NTT, Japan

In this paper, we propose a novel adaptation technique based on coarse/fine training of transfer vectors. We focus on estimating the transfer vector that moves a Gaussian mean from an initial model to an adapted model. The transfer vector is decomposed into a direction vector and a scaling factor. By estimating the direction vector at the tied-Gaussian class (coarse class) level and the scaling factor at the individual Gaussian class (fine class) level, we can obtain accurate transfer vectors with a small number of parameters. Simple training algorithms for transfer vector estimation are derived analytically using variational Bayes, maximum a posteriori (MAP), and maximum likelihood methods. Speaker adaptation experiments show that the proposed methods clearly improve speech recognition performance over conventional MAP adaptation for any amount of adaptation data.
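
The decomposition described above can be illustrated with a minimal sketch. It is not the authors' derivation: it assumes identity covariances, uses a plain maximum-likelihood-style estimate instead of the variational Bayes or MAP forms, and all names (`estimate_transfer_vectors`, `init_means`, `classes`, `counts`, `data_means`) are hypothetical. Each adapted mean is modeled as the initial mean plus a per-Gaussian scale times a direction shared by its tied class. The fallback for Gaussians with no adaptation data is also an assumption made for this illustration.

```python
import math


def estimate_transfer_vectors(init_means, classes, counts, data_means):
    """Sketch of coarse/fine transfer vector estimation (identity covariance assumed).

    init_means: list of initial mean vectors, one per Gaussian
    classes:    tied-class id of each Gaussian (coarse class)
    counts:     occupancy count of adaptation data per Gaussian
    data_means: mean of the adaptation data assigned to each Gaussian
                (may be None when the count is 0)
    Returns the adapted means mu_g = mu0_g + alpha_g * d_class(g).
    """
    dim = len(init_means[0])

    # Coarse step: pool occupancy-weighted mean shifts over each tied class.
    pooled, total = {}, {}
    for mu0, c, n, xbar in zip(init_means, classes, counts, data_means):
        acc = pooled.setdefault(c, [0.0] * dim)
        total[c] = total.get(c, 0.0) + n
        if n > 0:
            for k in range(dim):
                acc[k] += n * (xbar[k] - mu0[k])

    # Normalize each pooled shift into a unit direction; keep its length
    # as a class-level scale for Gaussians without data (an assumption).
    directions, class_scale = {}, {}
    for c, acc in pooled.items():
        shift = [a / total[c] for a in acc] if total[c] > 0 else acc
        norm = math.sqrt(sum(s * s for s in shift))
        directions[c] = [s / norm for s in shift] if norm > 0 else [0.0] * dim
        class_scale[c] = norm

    # Fine step: one scalar per Gaussian, the projection of its own
    # observed shift onto the shared class direction.
    adapted = []
    for mu0, c, n, xbar in zip(init_means, classes, counts, data_means):
        d = directions[c]
        if n > 0:
            alpha = sum(d[k] * (xbar[k] - mu0[k]) for k in range(dim))
        else:
            alpha = class_scale[c]
        adapted.append([mu0[k] + alpha * d[k] for k in range(dim)])
    return adapted
```

In this sketch, each tied class costs one direction vector and each Gaussian costs one scalar, which is the sense in which the parameter count stays small. A Gaussian that received no adaptation data still moves along its class direction, using the class-level shift length as its scale.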


Bibliographic reference. Watanabe, Shinji / Nakamura, Atsushi (2004): "Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task", in INTERSPEECH-2004, 2933-2936.