Eigenvoice is an effective speaker adaptation approach that balances recognition performance against the amount of adaptation data required. However, the conventional Maximum Likelihood Eigen-Decomposition (MLED) method used in eigenvoice adaptation is based on the Maximum Likelihood (ML) criterion and suffers from the unrealistic assumptions that HMMs make about the speech process, so alternative schemes may improve performance more effectively. In this paper, we propose a new discriminative adaptation algorithm called Maximum Mutual Information Eigen-Decomposition (MMIED), in which the mutual information between the training word sequences and the observation sequences is maximized. Through the use of word lattices, competing word hypotheses are taken into account to make the estimation more discriminative. MLED, MMIED, and Maximum a Posteriori Eigen-Decomposition (MAPED), which is based on the Maximum a Posteriori (MAP) criterion, were all evaluated experimentally to give a comprehensive comparison. Results showed that MMIED outperformed both MLED and MAPED.
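The contrast between the ML and MMI criteria can be sketched in standard textbook notation. All symbols below are assumptions introduced for illustration and are not taken from the paper: $\mathbf{e}_0$ is a mean supervector, $\mathbf{e}_1,\dots,\mathbf{e}_K$ are eigenvoices, $\mathbf{w}$ is the vector of speaker weights, $O$ is the adaptation observation sequence, $W_r$ is its reference word sequence, and $M_W$ is the composite HMM for word sequence $W$. A single utterance is shown for simplicity, and the exact formulation in the paper may differ.

```latex
% Eigenvoice model: the adapted mean supervector is a weighted
% combination of eigenvoices (assumed notation).
\boldsymbol{\mu}(\mathbf{w}) = \mathbf{e}_0 + \sum_{k=1}^{K} w_k \, \mathbf{e}_k

% ML criterion (as in MLED): maximize the likelihood of the
% adaptation data given the reference transcription only.
\hat{\mathbf{w}}_{\mathrm{ML}}
  = \arg\max_{\mathbf{w}} \; \log p_{\mathbf{w}}(O \mid M_{W_r})

% MMI criterion (as in MMIED): maximize the reference likelihood
% relative to all competing word sequences W.
F_{\mathrm{MMI}}(\mathbf{w})
  = \log \frac{p_{\mathbf{w}}(O \mid M_{W_r}) \, P(W_r)}
              {\sum_{W} p_{\mathbf{w}}(O \mid M_{W}) \, P(W)}
```

In this sketch, the sum in the denominator runs over all word sequences; in practice it would be approximated by the competing hypotheses encoded in a word lattice, which is the role the lattice plays in the description above.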
Cite as: Luo, J., Ou, Z., Wang, Z. (2005) Discriminative speaker adaptation with eigenvoices. Proc. Interspeech 2005, 1805-1808, doi: 10.21437/Interspeech.2005-168
@inproceedings{luo05_interspeech,
  author={Jun Luo and Zhijian Ou and Zuoying Wang},
  title={{Discriminative speaker adaptation with eigenvoices}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1805--1808},
  doi={10.21437/Interspeech.2005-168}
}