Sixth International Conference on Spoken Language Processing
October 16-20, 2000
Exhaustive Search for Lower-Bound Error-Rates in Vocal Tract Length Normalization
Philips Research Laboratories,
In the context of large-vocabulary, continuous speech recognition,
we address the problem of speaker normalization. In particular,
we address the main drawback of many vocal tract length normalization
(VTLN) studies and explore the relation between achieved
and potential error-rate reduction. In other words, we investigate
the correlations between the estimated and optimal warping factors.
In addition, we compare maximum-likelihood
VTLN and fast VTLN against the best possible (optimal) error rate
for each approach. The experimental results include achieved
error-rate reductions of 13.5% and a potential error-rate reduction
of about 20%. We show that maximum-likelihood VTLN
achieves 90% of the potential, speaker-based, error-rate reduction.
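
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that, for each speaker, recognition has been run once per warping factor on a small grid, so that an error count and a likelihood score are available for every factor. The exhaustive search then picks the factor with the fewest errors (the lower bound), while a maximum-likelihood criterion picks the factor with the highest score. The function names, the three-point grid, and all numbers below are hypothetical and are not taken from the paper.

```python
def pick_warps(errors, loglik):
    """Return (optimal, estimated) warping factors for one speaker.

    errors[a] : error count when decoding with warping factor a
    loglik[a] : log-likelihood score for warping factor a
    """
    # Exhaustive search: lowest error count; ties go to the factor nearest 1.0.
    optimal = min(errors, key=lambda a: (errors[a], abs(a - 1.0)))
    # Maximum-likelihood estimate: highest score, no reference transcript needed.
    estimated = max(loglik, key=loglik.get)
    return optimal, estimated


def achieved_fraction(baseline, achieved, lower_bound):
    """Fraction of the potential error reduction that was actually achieved."""
    potential = baseline - lower_bound
    if potential <= 0:
        return 0.0
    return (baseline - achieved) / potential


# Toy data (hypothetical): two speakers, three warping factors each.
speakers = {
    "A": {"errors": {0.92: 40, 1.00: 50, 1.08: 55},
          "loglik": {0.92: -100.0, 1.00: -110.0, 1.08: -120.0}},
    "B": {"errors": {0.92: 60, 1.00: 50, 1.08: 42},
          "loglik": {0.92: -130.0, 1.00: -100.0, 1.08: -105.0}},
}

baseline = achieved = lower_bound = 0
choices = {}
for name, data in speakers.items():
    optimal, estimated = pick_warps(data["errors"], data["loglik"])
    choices[name] = (optimal, estimated)
    baseline += data["errors"][1.00]          # unwarped system
    achieved += data["errors"][estimated]     # errors with the estimated factor
    lower_bound += data["errors"][optimal]    # errors with the optimal factor

fraction = achieved_fraction(baseline, achieved, lower_bound)
```

In this toy setup the estimate matches the optimum for speaker A but not for speaker B, so only part of the potential reduction is realized. A real experiment would use a finer grid and word error rates from actual decoding runs.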
Dolfing, Hans (2000):
"Exhaustive search for lower-bound error-rates in vocal tract length normalization",
In ICSLP-2000, vol.1, 762-765.