EUROSPEECH 2003 - INTERSPEECH 2003
Young speakers are not adequately represented in current speech recognizers. In this paper we focus on the problem of adapting the acoustic frontend of a speech recognizer that has been trained on adults' speech so that it performs better on children's speech. We introduce and evaluate a method that performs non-linear vocal tract length normalization (VTLN) through an unconstrained, data-driven optimization of the filterbank. A second approach normalizes the speaking rate of the young speakers with the PSOLA algorithm. Significant reductions in word error rate have been achieved.
Bibliographic reference. Stemmer, Georg / Hacker, Christian / Steidl, Stefan / Nöth, Elmar (2003): "Acoustic normalization of children's speech", In EUROSPEECH-2003, 1313-1316.