7th International Conference on Spoken Language Processing
September 16-20, 2002
In this paper, we extend a previously proposed algorithm, Structural Maximum Likelihood Eigenspace Mapping (SMLEM) [5, 6], for rapid speaker adaptation by exploring a variety of model clustering methods and incorporating a multi-stream approach. The SMLEM algorithm directly adapts speaker-independent acoustic models to a test speaker by mapping the Gaussian mixture components from a speaker-independent eigenspace to speaker-dependent eigenspaces in a maximum likelihood manner, using only very limited amounts of adaptation data. Evaluations are performed using the WSJ Spoke3 corpus. With the proposed improvements, SMLEM consistently outperforms both standard MLLR and block-diagonal MLLR for small amounts of adaptation data.
Bibliographic reference. Zhou, Bowen / Hansen, John H. L. (2002): "Improved structural maximum likelihood eigenspace mapping for rapid speaker adaptation", In ICSLP-2002, 1433-1436.