5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Speaker Adaptation Based on Pre-Clustering Training Speakers

Yuqing Gao, Mukund Padmanabhan, Michael Picheny

IBM, T. J. Watson Research Center, Yorktown Heights, NY, USA

A new strategy for speaker adaptation is described, based on the following steps: (1) all the speakers in the training set are pre-clustered acoustically; (2) for each speaker cluster, a system is built using the data from the speakers who belong to that cluster; (3) when a test speaker's data is available, a subset of these clusters closest to the test speaker is found; (4) each of the selected clusters is transformed to bring it closer to the test speaker's acoustic space; (5) a speaker-adapted model is built using the transformed cluster models. This method solves the problem of excessive storage for the training speaker models [1], as it is relatively inexpensive to store a model for each cluster. Also, because each cluster contains a number of speakers, the parameters of the models for each cluster can be robustly estimated. The algorithm has been evaluated on a large vocabulary system and compared to existing algorithms. The improvement over existing algorithms such as MLLR [2] is statistically significant.
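The five steps can be illustrated with a toy sketch. The abstract does not specify the clustering method, the distance measure, the transform, or how the transformed clusters are combined, so everything below is an assumption made for illustration: each speaker is reduced to a mean feature vector, clustering is plain k-means, a cluster "model" is the mean of its pooled frames, the transform is a simple shift halfway toward the test speaker (a stand-in, not MLLR), and the final model is an unweighted average. The names `kmeans`, `adapt`, `n_clusters`, and `n_select` are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on row vectors; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=k, replace=False)
    centroids = points[idx].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Distance from every point to every centroid, then nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def adapt(train_speakers, test_frames, n_clusters=4, n_select=2):
    """Toy version of steps (1)-(5); all modelling choices are assumptions."""
    # (1) Pre-cluster training speakers, each summarised by its mean vector.
    speaker_means = np.array([s.mean(axis=0) for s in train_speakers])
    _, labels = kmeans(speaker_means, n_clusters)

    # (2) Build one model per non-empty cluster from its pooled frames.
    models = []
    for j in range(n_clusters):
        pooled = [s for s, lab in zip(train_speakers, labels) if lab == j]
        if pooled:
            models.append(np.concatenate(pooled).mean(axis=0))
    models = np.array(models)

    # (3) Select the clusters closest to the test speaker.
    test_mean = test_frames.mean(axis=0)
    dists = np.linalg.norm(models - test_mean, axis=1)
    chosen = np.argsort(dists)[:n_select]

    # (4) Transform each selected cluster toward the test speaker
    #     (assumed: a shift covering half the gap to the test mean).
    transformed = models[chosen] + 0.5 * (test_mean - models[chosen])

    # (5) Combine the transformed cluster models (assumed: plain average).
    adapted = transformed.mean(axis=0)
    return adapted, chosen, models
```

Only one model per cluster is kept in this sketch, which mirrors the storage argument made in the abstract: the cost grows with the number of clusters rather than with the number of training speakers.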


Bibliographic reference.  Gao, Yuqing / Padmanabhan, Mukund / Picheny, Michael (1997): "Speaker adaptation based on pre-clustering training speakers", In EUROSPEECH-1997, 2091-2094.