Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

High Performance Connected Digit Recognition Through Gender-Dependent Acoustic Modelling and Vocal Tract Length Normalisation

Ramalingam Hariharan, Olli Viikki

Nokia Research Center, Speech and Audio Systems Laboratory, Tampere, Finland

Large inter-speaker variability is one of the major factors degrading the performance of state-of-the-art speech recognition systems. In recent years, several methods, including gender-dependent acoustic modelling and vocal tract length normalisation, have been developed to reduce this variability. In this paper, we first investigate these two methods individually and propose how they should be implemented in real-world speech recognition systems. Second, we show that combining the two techniques further reduces the error rate in a connected digit recognition task in a realistic car noise environment. Experimental results justify the use of the combined approach: a 44.1% decrease in string error rate was observed when the joint system was compared to the gender-independent baseline system. The results were also better than those obtained when using either technique individually.


Bibliographic reference. Hariharan, Ramalingam / Viikki, Olli (2000): "High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation", in ICSLP-2000, vol. 2, 847-850.