Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Exhaustive Search for Lower-Bound Error-Rates in Vocal Tract Length Normalization

Hans Dolfing

Philips Research Laboratories, Aachen, Germany

In the context of large-vocabulary continuous speech recognition, we address the problem of speaker normalization. In particular, we address the main drawback of many vocal tract length normalization (VTLN) studies and explore the relation between achieved and potential error-rate reduction. In other words, we investigate the correlation between the estimated and the optimal warping factors. In addition, we compare maximum-likelihood VTLN and fast VTLN against the best possible error rate for each approach. The experimental results include an achieved error-rate reduction of 13.5% and a potential error-rate reduction of about 20%. We show that maximum-likelihood VTLN achieves 90% of the potential speaker-based error-rate reduction.

