Sixth International Conference on Spoken Language Processing
October 16-20, 2000
High Performance Connected Digit Recognition Through Gender-Dependent Acoustic Modelling and Vocal Tract Length Normalisation
Ramalingam Hariharan, Olli Viikki
Nokia Research Center, Speech and Audio Systems Laboratory, Tampere, Finland
Large inter-speaker variability is one of the major factors
that degrade the performance of state-of-the-art speech
recognition systems. In recent years, several
methods, including gender-dependent acoustic modelling and
vocal tract length normalisation, have been developed to reduce
this variability. In this paper, we first investigate these two
methods individually and propose how they should be
implemented in real-world speech recognition systems.
Secondly, we show that by combining these two techniques, it is
possible to further reduce the error rate in a connected digit
recognition task under a realistic car noise environment.
Experimental results justify the use of the combined approach.
A 44.1% relative decrease in string error rate was observed when the
performance of the joint system was compared to the gender-independent
baseline system. The results were also better than
those obtained when using these techniques individually.
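As a reading aid for the reported metric, the following minimal sketch shows how a string error rate and its relative reduction are commonly computed. It assumes that a digit string counts as an error whenever the recognised string differs from the reference in any position. The function names and the toy data are hypothetical and are not taken from the paper's experiments.

```python
def string_error_rate(references, hypotheses):
    """Fraction of digit strings whose recognised output differs from the reference."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one string is required")
    # A string is wrong if it contains any substitution, insertion or deletion.
    errors = sum(1 for ref, hyp in zip(references, hypotheses) if ref != hyp)
    return errors / len(references)


def relative_reduction(baseline_rate, improved_rate):
    """Relative decrease of an error rate with respect to a baseline rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return (baseline_rate - improved_rate) / baseline_rate


# Hypothetical toy data, for illustration only (not results from the paper).
refs = ["1234", "5678", "9012", "3456"]
baseline_hyps = ["1234", "5578", "9012", "3466"]  # two strings wrong
joint_hyps = ["1234", "5678", "9012", "3466"]     # one string wrong

baseline_ser = string_error_rate(refs, baseline_hyps)
joint_ser = string_error_rate(refs, joint_hyps)
reduction = relative_reduction(baseline_ser, joint_ser)
```

Under this definition, a figure such as the 44.1% quoted above is read as a reduction relative to the baseline error rate, not as a difference in absolute percentage points.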
Hariharan, Ramalingam / Viikki, Olli (2000):
"High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation",
In ICSLP-2000, vol.2, 847-850.