Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Discriminative Speaker Adaptation with Eigenvoices

Jun Luo, Zhijian Ou, Zuoying Wang

Tsinghua University, Beijing, China

Eigenvoice is an effective speaker adaptation approach that balances recognition performance against the amount of adaptation data required. However, the conventional Maximum Likelihood Eigen-Decomposition (MLED) method used in eigenvoice adaptation is based on the Maximum Likelihood (ML) criterion and suffers from the unrealistic assumptions that HMMs make about the speech process, so alternative criteria may yield better performance. In this paper, we propose a new discriminative adaptation algorithm called Maximum Mutual Information Eigen-Decomposition (MMIED), in which the mutual information between the training word sequences and the observation sequences is maximized. By using a word lattice, competing word hypotheses are taken into account, which makes the estimation more discriminative. MLED, MMIED and Maximum a Posteriori Eigen-Decomposition (MAPED), which is based on the Maximum a Posteriori (MAP) criterion, were all evaluated to give a comprehensive comparison. The results show that MMIED outperforms both MLED and MAPED.
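
For orientation, the three estimation criteria compared in the abstract can be written side by side. This is only a sketch in the standard textbook form: all notation is assumed here and is not taken from the paper (w for the eigenvoice weight vector, e_0 for the mean supervector, e_1..e_K for the eigenvoices, O for the adaptation observations, W_ref for the reference transcription, and L for the set of word sequences in the lattice). The paper's exact objective, including any scaling factors or priors, may differ.

```latex
% Assumed notation, for illustration only (not the paper's own symbols).

% Eigenvoice model: the adapted mean supervector is the mean supervector
% plus a weighted sum of K eigenvoices.
\mu(w) = e_0 + \sum_{k=1}^{K} w_k \, e_k

% MLED: choose the weights that maximize the likelihood of the
% adaptation data given the reference transcription.
\hat{w}_{\mathrm{ML}} = \arg\max_{w} \; \log p(O \mid W_{\mathrm{ref}};\, w)

% MAPED: as above, with a prior p(w) on the weights.
\hat{w}_{\mathrm{MAP}} = \arg\max_{w} \;
  \bigl[ \log p(O \mid W_{\mathrm{ref}};\, w) + \log p(w) \bigr]

% MMIED: maximize the posterior of the reference transcription; the
% denominator sums over the competing hypotheses W' in the lattice.
\hat{w}_{\mathrm{MMI}} = \arg\max_{w} \;
  \log \frac{p(O \mid W_{\mathrm{ref}};\, w)\, P(W_{\mathrm{ref}})}
            {\sum_{W' \in \mathcal{L}} p(O \mid W';\, w)\, P(W')}
```

In this form, the only difference between the ML and MMI objectives is the denominator: raising the likelihood of the reference transcription is rewarded only to the extent that the likelihoods of the competing lattice hypotheses do not rise with it.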


Bibliographic reference.  Luo, Jun / Ou, Zhijian / Wang, Zuoying (2005): "Discriminative speaker adaptation with eigenvoices", In INTERSPEECH-2005, 1805-1808.