Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Model-based Speaker Normalization Methods for Speech Recognition

Masaki Naito, Li Deng, Yoshinori Sagisaka

ATR Interpreting Telecommunications Research Labs., Seika-cho, Soraku-gun, Kyoto, Japan

We address the problem of how vocal-tract (VT) parameters and the related VT geometric model can be used effectively to normalize the acoustic properties of speech across speakers. The problem is important because speaker variability is a major obstacle to high-accuracy speech recognition, and the use of VT parameters offers a natural way to account for such variability. The data-driven methods used in conventional speaker adaptation technology require a large amount of adaptation data, but our experimental results show that the new model-based speaker normalization method described in this paper is superior in performance while drastically reducing the amount of adaptation data needed to normalize speakers.


Bibliographic reference. Naito, Masaki / Deng, Li / Sagisaka, Yoshinori (1999): "Model-based speaker normalization methods for speech recognition", In EUROSPEECH'99, 2515-2518.