We propose a fast speaker adaptation method using an aspect model. The performance of a speaker-independent (SI) model is very sensitive to environmental factors such as microphones, speakers, and noise. Speaker adaptation techniques try to obtain near speaker-dependent (SD) performance from only a small amount of speaker-specific data and are often based on an initial SI model. One of the most important goals of an adaptation algorithm is to modify a large number of parameters using only a small amount of adaptation data. The number of free parameters to be estimated from the adaptation data can be reduced by using an aspect model. In this paper, we introduce an aspect model into an acoustic model for rapid speaker adaptation. A formulation of probabilistic latent semantic analysis (PLSA) is extended to the continuous density hidden Markov model (HMM). We carried out an isolated word recognition experiment on a Korean database, and the results are compared with those of the conventional expectation maximization (EM) algorithm, maximum a posteriori (MAP) adaptation, and maximum likelihood linear regression (MLLR).
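To illustrate why an aspect model reduces the number of free parameters, the following minimal sketch keeps a set of one-dimensional Gaussian "aspects" fixed and re-estimates only their mixing weights from a handful of adaptation samples with EM. It is not the method of this paper: the function names, the aspect means and variances, and the synthetic data are all assumptions made for illustration, and the paper's actual formulation applies to continuous density HMMs.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def estimate_aspect_weights(data, aspects, n_iter=50):
    """EM re-estimation of mixing weights only; aspect densities stay fixed.

    data:    list of floats (adaptation observations)
    aspects: list of (mean, var) pairs, one per latent aspect (hypothetical)
    """
    k = len(aspects)
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(n_iter):
        counts = [0.0] * k
        for x in data:
            # E-step: posterior probability of each aspect given x
            joint = [w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights, aspects)]
            total = sum(joint)
            if total == 0.0:
                continue  # skip a sample whose likelihood underflowed
            for z in range(k):
                counts[z] += joint[z] / total
        norm = sum(counts)
        if norm == 0.0:
            break  # nothing to update from
        # M-step: new weights are the normalized expected counts
        weights = [c / norm for c in counts]
    return weights


# Synthetic example: 20 adaptation samples drawn near the third aspect.
rng = random.Random(0)
aspects = [(-2.0, 1.0), (0.0, 1.0), (3.0, 1.0)]  # assumed, fixed in advance
data = [rng.gauss(3.0, 1.0) for _ in range(20)]
weights = estimate_aspect_weights(data, aspects)
print(weights)
```

With K aspects, only K - 1 free weights are estimated per speaker in this sketch, however many parameters the fixed aspect densities themselves contain.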
Bibliographic reference. Hahm, Seongjun / Ito, Akinori / Makino, Shozo / Suzuki, Motoyuki (2008): "A fast speaker adaptation method using aspect model", In INTERSPEECH-2008, 1221-1224.