5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Eigenvoices for Speaker Adaptation

Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua, Lloyd Goldwasser, Nancy Niedzielski, Steven Fincke, Ken Field, Matteo Contolini

Panasonic Technologies Inc., Speech Technology Laboratory, USA

We have devised a new class of fast adaptation techniques for speech recognition. These techniques are based on prior knowledge of speaker variation, obtained by applying Principal Component Analysis (PCA) or a similar technique to T vectors of dimension D derived from T speaker-dependent models. This offline step yields T basis vectors called "eigenvoices". We constrain the model for a new speaker S to lie in the space spanned by the first K eigenvoices. Speaker adaptation then involves estimating the K eigenvoice coefficients for the new speaker; typically, K is very small compared to D. We conducted mean adaptation experiments on the Isolet database. With a large amount of supervised adaptation data, most eigenvoice techniques performed slightly better than MAP or MLLR; with small amounts of supervised adaptation data, or for unsupervised adaptation, some eigenvoice techniques performed much better. We believe that the eigenvoice approach would yield rapid adaptation for most speech recognition systems.
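The two steps described in the abstract (an offline PCA over T speaker vectors, then estimation of K coefficients for a new speaker) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `train_eigenvoices` and `adapt`, the mean-centering before PCA, the synthetic random data, and in particular the use of a plain least-squares projection as a stand-in for the coefficient estimator, which the abstract does not specify.

```python
import numpy as np

def train_eigenvoices(supervectors, k):
    """Offline step: PCA over T speaker vectors of dimension D.

    supervectors: (T, D) array, one row per speaker-dependent model.
    Returns the mean vector and the first k principal directions.
    Mean-centering is an assumption of this sketch.
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by
    # decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def adapt(mean, eigenvoices, observed):
    """Adaptation step: estimate K coefficients for a new speaker.

    Uses a least-squares projection onto the eigenvoice basis, a
    simplification chosen for this sketch (hypothetical estimator).
    """
    coeffs = eigenvoices @ (observed - mean)   # shape (K,)
    adapted = mean + coeffs @ eigenvoices      # shape (D,)
    return coeffs, adapted

# Synthetic data (not from the paper): speakers vary along K hidden
# directions in a D-dimensional space, plus a little noise.
rng = np.random.default_rng(0)
T, D, K = 20, 50, 3
true_basis = rng.normal(size=(K, D))
train = rng.normal(size=(T, K)) @ true_basis + 0.01 * rng.normal(size=(T, D))

mean, eigenvoices = train_eigenvoices(train, K)

new_speaker = rng.normal(size=K) @ true_basis
coeffs, adapted = adapt(mean, eigenvoices, new_speaker)
rel_error = np.linalg.norm(adapted - new_speaker) / np.linalg.norm(new_speaker)
print("coefficients:", coeffs)
print("relative reconstruction error:", rel_error)
```

The point of the sketch is the parameter count: the new speaker is described by K numbers (3 here) rather than D (50 here), which is why very little adaptation data is needed once the basis is fixed.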

Bibliographic reference. Kuhn, Roland / Nguyen, Patrick / Junqua, Jean-Claude / Goldwasser, Lloyd / Niedzielski, Nancy / Fincke, Steven / Field, Ken / Contolini, Matteo (1998): "Eigenvoices for speaker adaptation", in ICSLP-1998, paper 0303.