5th International Conference on Spoken Language Processing
We have devised a new class of fast adaptation techniques for speech recognition. These techniques are based on prior knowledge of speaker variation, obtained by applying Principal Component Analysis (PCA) or a similar technique to T vectors of dimension D derived from T speaker-dependent models. This offline step yields T basis vectors called "eigenvoices". We constrain the model for a new speaker S to lie in the space spanned by the first K eigenvoices. Speaker adaptation then consists of estimating the K eigenvoice coefficients for the new speaker; typically, K is very small compared to D. We conducted mean adaptation experiments on the Isolet database. With a large amount of supervised adaptation data, most eigenvoice techniques performed slightly better than MAP or MLLR; with small amounts of supervised adaptation data, or for unsupervised adaptation, some eigenvoice techniques performed much better. We believe that the eigenvoice approach would yield rapid adaptation for most speech recognition systems.
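The two steps described above (an offline PCA over T speaker vectors, then estimation of K coefficients for a new speaker) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names `train_eigenvoices` and `adapt` are hypothetical, and the coefficients are obtained by a plain least-squares projection of a noisy estimate of the speaker's vector, which is an assumption standing in for whatever estimator the paper actually uses.

```python
import numpy as np


def train_eigenvoices(supervectors, k):
    """Offline step: PCA over T speaker vectors of dimension D.

    supervectors: array of shape (T, D), one row per training speaker.
    Returns the mean vector (D,) and the first k eigenvoices (k, D).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def adapt(mean, eigenvoices, observed):
    """Adaptation step: estimate K coefficients for a new speaker.

    ASSUMPTION: least-squares projection onto the eigenvoice space.
    Because the eigenvoices are orthonormal, this is a matrix product.
    """
    coeffs = eigenvoices @ (observed - mean)
    adapted = mean + coeffs @ eigenvoices
    return coeffs, adapted


# Synthetic illustration (all values below are made up for the sketch).
rng = np.random.default_rng(0)
T, D, K = 20, 200, 3

# Training speakers vary along K hidden directions, plus a little noise.
true_basis = rng.normal(size=(K, D))
train = rng.normal(size=(T, K)) @ true_basis + 0.01 * rng.normal(size=(T, D))
mean, voices = train_eigenvoices(train, K)

# A new speaker, seen only through a noisy estimate (little adaptation data).
new_speaker = rng.normal(size=K) @ true_basis
noisy_estimate = new_speaker + 0.5 * rng.normal(size=D)

coeffs, adapted = adapt(mean, voices, noisy_estimate)
err_before = np.linalg.norm(noisy_estimate - new_speaker)
err_after = np.linalg.norm(adapted - new_speaker)
print(f"error before constraint: {err_before:.3f}")
print(f"error after constraint:  {err_after:.3f}")
```

Only K numbers are estimated instead of D, so noise in the directions outside the eigenvoice space is discarded. This is the intuition behind the claim that the constraint helps most when adaptation data are scarce.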
Bibliographic reference. Kuhn, Roland / Nguyen, Patrick / Junqua, Jean-Claude / Goldwasser, Lloyd / Niedzielski, Nancy / Fincke, Steven / Field, Ken / Contolini, Matteo (1998): "Eigenvoices for speaker adaptation", In ICSLP-1998, paper 0303.