ISCA Archive ICSLP 1992

Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs

Kazumi Ohkura, Masahide Sugiyama, Shigeki Sagayama

This paper describes a speaker adaptation method for continuous mixture density HMMs (CDHMMs). Speaker adaptation in CDHMMs is treated as a retraining problem in which only a small amount of training data is available. The Vector Field Smoothing (VFS) method is used to deal with the problem of retraining with insufficient training data. VFS is applied simultaneously to inter-speaker and speaking-style adaptation. In this paper, the standard speaker is male, and the unknown speakers for adaptation are one male and one female. When 11 sentences are uttered for adaptation phrase-by-phrase instead of word-by-word, the 23-phoneme recognition rate is 87.4% (without adaptation: 47.3%). The phrase recognition rate for HMM-LR is 85.1% (without adaptation: 21.5%).


doi: 10.21437/ICSLP.1992-106

Cite as: Ohkura, K., Sugiyama, M., Sagayama, S. (1992) Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs. Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992), 369-372, doi: 10.21437/ICSLP.1992-106

@inproceedings{ohkura92_icslp,
  author={Kazumi Ohkura and Masahide Sugiyama and Shigeki Sagayama},
  title={{Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs}},
  year=1992,
  booktitle={Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992)},
  pages={369--372},
  doi={10.21437/ICSLP.1992-106}
}