9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Improvement of Eigenvoice-Based Speaker Adaptation by Parameter Space Clustering

Shutaro Tanji (1), Koichi Shinoda (1), Sadaoki Furui (1), Antonio Ortega (2)

(1) Tokyo Institute of Technology, Japan; (2) University of Southern California, USA

The segmental eigenvoice method has been proposed to provide rapid speaker adaptation when only a limited amount of adaptation data is available. In this method, the speaker-vector space is clustered into several subspaces and PCA is applied to each of the resulting subspaces. In this paper, we propose two new techniques to improve the performance of the segmental eigenvoice approach. First, we propose a soft-clustering method in which each element of a speaker vector can be assigned to more than one cluster. Second, elements that lie far from every cluster are removed. Our experiments using the JNAS and S-JNAS databases show that the proposed method outperforms both the original eigenvoice and the segmental eigenvoice methods, for example by a 3.3% average improvement when only 10 utterances are used for adaptation.
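The two proposed steps, soft assignment of speaker-vector elements to clusters and removal of elements far from every cluster, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the distance measure or the thresholds, so the Euclidean distance, the function name `soft_assign`, and the parameters `soft_ratio` and `outlier_dist` are all assumptions made for illustration. The per-cluster PCA step is omitted.

```python
import math

def soft_assign(elements, centroids, soft_ratio=1.5, outlier_dist=2.0):
    """Hypothetical soft clustering of speaker-vector elements.

    elements:  list of feature tuples, one per speaker-vector element
    centroids: list of cluster centroid tuples (assumed already estimated)
    soft_ratio, outlier_dist: illustrative thresholds, not from the paper
    Returns (clusters, removed): clusters maps cluster index -> element indices.
    """
    clusters = {k: [] for k in range(len(centroids))}
    removed = []
    for i, x in enumerate(elements):
        dists = [math.dist(x, c) for c in centroids]
        nearest = min(dists)
        # Assumed pruning rule: drop elements far from every cluster.
        if nearest > outlier_dist:
            removed.append(i)
            continue
        # Assumed soft rule: join every cluster nearly as close as the nearest.
        for k, d in enumerate(dists):
            if d <= soft_ratio * nearest:
                clusters[k].append(i)
    return clusters, removed

# Toy data: four 2-D "elements" and two centroids (made-up values).
elements = [(0.0, 0.1), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
centroids = [(0.0, 0.0), (1.0, 1.0)]
clusters, removed = soft_assign(elements, centroids)
```

In this toy case, element 2 is equidistant from both centroids and so belongs to both clusters, while element 3 is far from both and is discarded. In a segmental eigenvoice setting, PCA would then be applied separately to the elements of each cluster.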

Bibliographic reference. Tanji, Shutaro / Shinoda, Koichi / Furui, Sadaoki / Ortega, Antonio (2008): "Improvement of eigenvoice-based speaker adaptation by parameter space clustering", In INTERSPEECH-2008, 1229-1232.