ISCA Archive Interspeech 2008

A fast speaker adaptation method using aspect model

Seongjun Hahm, Akinori Ito, Shozo Makino, Motoyuki Suzuki

We propose a fast speaker adaptation method using an aspect model. The performance of a speaker-independent (SI) model is very sensitive to conditions such as microphones, speakers, and noise. Speaker adaptation techniques try to obtain near speaker-dependent (SD) performance with only a small amount of speaker-specific data and are often based on an initial SI model. One of the most important goals of an adaptation algorithm is to modify a large number of parameters using only a small amount of adaptation data. The number of free parameters to be estimated from the adaptation data can be reduced by using an aspect model. In this paper, we introduce an aspect model into an acoustic model for rapid speaker adaptation. The formulation of probabilistic latent semantic analysis (PLSA) is extended to continuous-density HMMs. We carried out an isolated word recognition experiment on a Korean database, and the results are compared with those of the conventional expectation-maximization (EM) algorithm, maximum a posteriori (MAP) adaptation, and maximum likelihood linear regression (MLLR).
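
The parameter-reduction idea can be illustrated with a toy sketch. It is not the formulation from the paper, which extends PLSA to continuous-density HMMs. It assumes a much simpler setting: each latent aspect is a fixed one-dimensional Gaussian, and only the speaker-specific aspect weights are re-estimated from the adaptation data by EM. The function names `gaussian_pdf` and `estimate_aspect_weights`, and the toy data, are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def estimate_aspect_weights(frames, aspects, n_iter=50):
    """EM estimate of mixing weights over fixed aspect distributions.

    frames  : list of scalar observations (toy stand-in for adaptation data)
    aspects : list of (mean, variance) pairs, kept fixed (assumption)
    Only len(aspects) weights are estimated, however many parameters
    the aspect distributions themselves contain.
    """
    k = len(aspects)
    weights = [1.0 / k] * k  # uniform initialisation
    for _ in range(n_iter):
        counts = [0.0] * k
        for x in frames:
            # E-step: posterior probability of each aspect given the frame
            joint = [w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights, aspects)]
            total = sum(joint)
            if total == 0.0:
                continue  # frame is numerically unexplained by every aspect
            for i in range(k):
                counts[i] += joint[i] / total
        norm = sum(counts)
        if norm == 0.0:
            break  # no usable frames; keep the current weights
        # M-step: new weights are the normalised expected counts
        weights = [c / norm for c in counts]
    return weights


# Toy adaptation data clustered near the first aspect.
frames = [-0.2, 0.1, 0.3, -0.1, 0.0]
aspects = [(0.0, 1.0), (5.0, 1.0)]
weights = estimate_aspect_weights(frames, aspects)
```

In this sketch the adaptation data only has to determine two weights, which is the sense in which an aspect model can reduce the number of free parameters. The actual method in the paper may tie and estimate parameters differently.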


doi: 10.21437/Interspeech.2008-371

Cite as: Hahm, S., Ito, A., Makino, S., Suzuki, M. (2008) A fast speaker adaptation method using aspect model. Proc. Interspeech 2008, 1221-1224, doi: 10.21437/Interspeech.2008-371

@inproceedings{hahm08_interspeech,
  author={Seongjun Hahm and Akinori Ito and Shozo Makino and Motoyuki Suzuki},
  title={{A fast speaker adaptation method using aspect model}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1221--1224},
  doi={10.21437/Interspeech.2008-371}
}