Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

On Combining Vocal Tract Length Normalisation and Speaker Adaptation for Noise Robust Speech Recognition

Ramalingam Hariharan, Olli Viikki

Nokia Research Center, Speech and Audio Systems Laboratory, Tampere, Finland

This paper investigates the combination of vocal tract length normalisation and speaker adaptation in connected digit recognition. In particular, we focus on performing this task in a continuously varying car noise environment. Continuous supervised speaker and environment adaptation is carried out on the test data within the Bayesian framework. The paper also evaluates several ways of implementing vocal tract length normalisation. The best performance was obtained when normalisation was applied both during the initial speaker-independent training and during testing. We also observed that, during testing, speaker-specific normalisation produced better results than utterance-specific normalisation. Our experimental results on the connected digit database show that the joint approach outperforms the system in which on-line Bayesian speaker adaptation is performed on the HMM mean parameters. The performance gain was particularly large for so-called outlier speakers, for whom adaptation is truly needed.
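As an illustration of what Bayesian adaptation of an HMM mean parameter can look like, the sketch below implements the textbook MAP mean update, in which the adapted mean interpolates between the speaker-independent prior mean and the occupancy-weighted average of the adaptation frames. It is a minimal sketch under stated assumptions, not the procedure used in this paper: the function name `map_update_mean`, the prior weight `tau` and its default value, and the representation of frames as plain Python lists are all hypothetical choices made for the example.

```python
def map_update_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP re-estimate of a single Gaussian mean vector (illustrative only).

    prior_mean : speaker-independent mean, one float per feature dimension
    frames     : adaptation feature vectors, each the same length as prior_mean
    posteriors : occupation probability of this Gaussian for each frame
    tau        : hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)

    # Accumulate the posterior-weighted sum of the adaptation frames.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With no adaptation data the function returns the prior mean unchanged, and as the accumulated occupancy grows the estimate moves towards the mean of the adaptation frames, which is the behaviour that makes this kind of update suitable for continuous on-line use.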

