EUROSPEECH 2003 - INTERSPEECH 2003
The recognition performance of automatic speech recognition systems can be improved by reducing the mismatch between training and test data during feature extraction. The approach described in this paper estimates the signal's cumulative distribution functions at the filter bank outputs using a small number of quantiles. A two-step transformation is then applied to reduce the difference between these quantiles and those estimated on the training data. The first step is a power function transformation applied to each individual filter channel; the second is a linear combination of neighboring filters. On the Aurora 4 16 kHz database, the average word error rate was reduced from 60.8% to 37.6% (clean training) and from 38.0% to 31.5% (multi-condition training).
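The two-step transformation summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact functional form, so the pure power function `q_max * (x / q_max) ** gamma`, the grid search over `gamma`, the choice of four quantile intervals, the neighbor weights `lam` and `rho`, and all function names are assumptions made for illustration. Filter outputs are assumed to be non-negative.

```python
import numpy as np

def estimate_quantiles(channel, n_intervals=4):
    """Quantiles at 0%, 25%, ..., 100% of one filter channel over time."""
    probs = np.linspace(0.0, 1.0, n_intervals + 1)
    return np.quantile(channel, probs)

def power_transform(x, q_max, gamma):
    """Assumed form: power function that leaves 0 and q_max unchanged."""
    return q_max * (x / q_max) ** gamma

def fit_gamma(test_q, train_q, gammas=None):
    """Grid search for the exponent that moves the inner test quantiles
    closest (in squared distance) to the inner training quantiles."""
    if gammas is None:
        gammas = np.linspace(0.2, 5.0, 97)  # hypothetical search grid
    q_max = test_q[-1]
    errors = [
        np.sum((power_transform(test_q[1:-1], q_max, g) - train_q[1:-1]) ** 2)
        for g in gammas
    ]
    return float(gammas[int(np.argmin(errors))])

def combine_neighbors(frames, lam, rho):
    """Step 2: mix each channel with its left (lam) and right (rho) neighbor.
    frames has shape (time, channels); edge channels reuse their own value
    in place of the missing neighbor."""
    out = (1.0 - lam - rho) * frames
    out[:, 1:] += lam * frames[:, :-1]   # contribution of the left neighbor
    out[:, :-1] += rho * frames[:, 1:]   # contribution of the right neighbor
    out[:, 0] += lam * frames[:, 0]      # no left neighbor for first channel
    out[:, -1] += rho * frames[:, -1]    # no right neighbor for last channel
    return out
```

In this sketch, `fit_gamma` would be called once per filter channel with the quantiles of the current test data and the stored training quantiles, and `combine_neighbors` would then be applied to the power-transformed frames. How `lam` and `rho` are chosen is left open here; the abstract only states that the second step also aims to reduce the quantile mismatch.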
Bibliographic reference. Hilger, Florian / Ney, Hermann (2003): "Evaluation of quantile based histogram equalization with filter combination on the Aurora 3 and 4 databases", In EUROSPEECH-2003, 341-344.