11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Speaker Adaptation Based on Nonlinear Spectral Transform for Speech Recognition

Toyohiro Hayashi, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Nagoya Institute of Technology, Japan

This paper proposes a speaker adaptation technique using a nonlinear spectral transform based on GMMs. One of the most popular forms of speaker adaptation is based on linear transforms, e.g., MLLR. Although MLLR uses multiple transforms according to regression classes, only a single linear transform is applied to each state. The proposed method performs nonlinear speaker adaptation based on a new likelihood function that combines HMMs for recognition with GMMs for the spectral transform. Moreover, the context dependency of the transforms can also be estimated in an integrated ML fashion. In phoneme recognition experiments, the proposed technique shows better performance than the conventional approaches.


Bibliographic reference.  Hayashi, Toyohiro / Nankaku, Yoshihiko / Lee, Akinobu / Tokuda, Keiichi (2010): "Speaker adaptation based on nonlinear spectral transform for speech recognition", In INTERSPEECH-2010, 542-545.