In this paper, we describe a new speaker adaptation method based on the matrix-variate distribution of training models. A set of mean vectors of hidden Markov models (HMMs) is assumed to be drawn from a matrix-variate normal distribution, and bases are derived under this assumption. The resulting bases have the same dimension as the eigenvoices, so adaptation can be performed using the same equation. In isolated-word experiments, the proposed method performed comparably to the eigenvoice method in a clean environment and outperformed it in both babble and factory-floor noise. These results support the validity of the matrix-variate normal assumption for the training models and suggest that the proposed method can be used for rapid speaker adaptation in noisy environments.
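To make the idea concrete, the following is a minimal sketch of how bases might be obtained under a matrix-variate normal assumption and then used in an eigenvoice-style update. It is not the authors' procedure: the abstract does not specify how the bases are derived, so everything here is an assumption. Each training speaker's mean vectors are arranged as an R x C matrix, the row and column covariances U and V are estimated with the standard "flip-flop" maximum-likelihood iteration, and the bases are taken as Kronecker products of their eigenvectors (which are the eigenvectors of the covariance of the column-stacked matrix, V ⊗ U). The function names (`fit_matrix_normal`, `kron_bases`, `estimate_weights`, `adapt`) are hypothetical, and a plain least-squares fit stands in for the maximum-likelihood weight estimation that would be used with real HMM statistics.

```python
import numpy as np

def fit_matrix_normal(X, n_iter=20):
    """Flip-flop ML estimates for a matrix-variate normal.

    X has shape (S, R, C): S training speakers, each an R x C matrix of
    HMM mean parameters (this layout is an assumption of the sketch).
    Returns the mean matrix M, row covariance U (R x R), and column
    covariance V (C x C).
    """
    S, R, C = X.shape
    M = X.mean(axis=0)
    D = X - M
    U, V = np.eye(R), np.eye(C)
    for _ in range(n_iter):
        V_inv = np.linalg.inv(V)
        U = sum(d @ V_inv @ d.T for d in D) / (S * C)
        U_inv = np.linalg.inv(U)
        V = sum(d.T @ U_inv @ d for d in D) / (S * R)
    return M, U, V

def kron_bases(U, V, K):
    """Top-K eigenvectors of V kron U, built from eigenvectors of U and V.

    Each basis has length R*C, the same as a column-stacked supervector.
    """
    lam_u, E_u = np.linalg.eigh(U)
    lam_v, E_v = np.linalg.eigh(V)
    prod = np.outer(lam_v, lam_u)              # eigenvalues of V kron U
    order = np.argsort(prod, axis=None)[::-1][:K]
    cols = []
    for flat in order:
        j, i = np.unravel_index(flat, prod.shape)
        cols.append(np.kron(E_v[:, j], E_u[:, i]))
    return np.stack(cols, axis=1)              # shape (R*C, K)

def estimate_weights(target, mean_vec, bases):
    """Least-squares weights (a stand-in for ML weight estimation)."""
    w, *_ = np.linalg.lstsq(bases, target - mean_vec, rcond=None)
    return w

def adapt(mean_vec, bases, w):
    """Eigenvoice-style update: mean plus a weighted sum of bases."""
    return mean_vec + bases @ w
```

In this sketch the supervector is the column-stacked matrix, for example `M.flatten(order="F")`, so that its covariance matches the V ⊗ U ordering used in `kron_bases`. The flip-flop estimates require enough training speakers for U and V to be invertible (roughly S*C ≥ R and S*R ≥ C).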
Bibliographic reference. Jeong, Yongwon / Kim, Young Kuk (2011): "Matrix-variate distribution of training models for robust speaker adaptation", In INTERSPEECH-2011, 1093-1096.