ISCA Archive Interspeech 2005

Discriminative speaker adaptation with eigenvoices

Jun Luo, Zhijian Ou, Zuoying Wang

Eigenvoice is an effective speaker adaptation approach that balances recognition performance against the amount of adaptation data required. However, the conventional Maximum Likelihood Eigen-Decomposition (MLED) method for eigenvoice adaptation is based on the Maximum Likelihood (ML) criterion and suffers from the unrealistic assumptions that HMMs make about the speech process, so alternative estimation schemes may improve performance. In this paper, we propose a new discriminative adaptation algorithm called Maximum Mutual Information Eigen-Decomposition (MMIED), in which the mutual information between the training word sequences and the observation sequences is maximized. Word lattices are used to take competing word hypotheses into account, which makes the estimation more discriminative. MLED, MMIED, and Maximum a Posteriori Eigen-Decomposition (MAPED), which is based on the Maximum a Posteriori (MAP) criterion, were all evaluated experimentally to give a comprehensive comparison. Results showed that MMIED outperformed both MLED and MAPED.
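
For orientation, the three criteria named in the abstract can be written in their generic textbook forms. This is a sketch under stated assumptions, not the paper's own derivation: the notation (weight vector \mathbf{w}, eigenvoices \mathbf{e}_k, mean supervector \mathbf{e}_0, utterances O_r with reference transcriptions W_r, language model probability P(W), prior p(\mathbf{w})) is chosen here for illustration and may differ from the notation and exact formulation used in the paper. In practice the denominator sum in the MMI criterion is typically approximated over the hypotheses encoded in a word lattice.

```latex
% Eigenvoice model (assumed notation): the adapted mean supervector is a
% weighted combination of K eigenvoices around a mean supervector e_0.
\boldsymbol{\mu}(\mathbf{w}) = \mathbf{e}_0 + \sum_{k=1}^{K} w_k \, \mathbf{e}_k

% ML criterion (MLED): maximize the likelihood of the adaptation data
% given the reference transcriptions.
\hat{\mathbf{w}}_{\mathrm{ML}}
  = \arg\max_{\mathbf{w}} \sum_{r=1}^{R} \log p\bigl(O_r \mid W_r, \mathbf{w}\bigr)

% MAP criterion (MAPED): ML objective plus a log prior on the weights.
\hat{\mathbf{w}}_{\mathrm{MAP}}
  = \arg\max_{\mathbf{w}} \Bigl[ \sum_{r=1}^{R} \log p\bigl(O_r \mid W_r, \mathbf{w}\bigr)
    + \log p(\mathbf{w}) \Bigr]

% MMI criterion (MMIED): maximize the posterior of the reference
% transcription against all competing word sequences W'.
\hat{\mathbf{w}}_{\mathrm{MMI}}
  = \arg\max_{\mathbf{w}} \sum_{r=1}^{R}
    \log \frac{p\bigl(O_r \mid W_r, \mathbf{w}\bigr) \, P(W_r)}
              {\sum_{W'} p\bigl(O_r \mid W', \mathbf{w}\bigr) \, P(W')}
```

The ML and MAP objectives only reward fitting the correct transcription, whereas the MMI objective also penalizes weight settings that make competing hypotheses likely, which is the sense in which the estimation is discriminative.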

doi: 10.21437/Interspeech.2005-168

Cite as: Luo, J., Ou, Z., Wang, Z. (2005) Discriminative speaker adaptation with eigenvoices. Proc. Interspeech 2005, 1805-1808, doi: 10.21437/Interspeech.2005-168

  author={Jun Luo and Zhijian Ou and Zuoying Wang},
  title={{Discriminative speaker adaptation with eigenvoices}},
  booktitle={Proc. Interspeech 2005},