7th International Conference on Spoken Language Processing
September 16-20, 2002
We describe two methods that normalize acoustic vectors at the filterbank level so that the test data distribution matches the training data distribution. They enhance the previously proposed histogram normalization technique by accounting for the variable silence fraction of each speaker and by rotating the feature space. We report a number of recognition tests under minor (different microphones in training and test, telephone data) and major (office vs. car recordings) mismatch conditions. Both methods outperform the basic histogram normalization approach. The overall improvements in word error rate (WER) range from 6% to 85% relative.
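As a rough illustration of the basic idea only, the following sketch maps the values of one filterbank channel onto the training distribution by matching empirical cumulative distributions. It is not the authors' implementation and does not include the silence-fraction handling or the feature space rotation; the function name `histogram_normalize`, the nearest-rank quantile rule, and the toy data are assumptions made for this example.

```python
from bisect import bisect_right


def histogram_normalize(test_values, train_values):
    """Map each test value to the training value at the same empirical CDF rank.

    Hypothetical helper for a single filterbank channel; a real system
    would apply such a mapping to every channel separately.
    """
    sorted_test = sorted(test_values)
    sorted_train = sorted(train_values)
    n_test = len(sorted_test)
    n_train = len(sorted_train)

    normalized = []
    for value in test_values:
        # Rank of the value in the test data (1..n_test), i.e. the
        # empirical test CDF multiplied by n_test.
        rank = bisect_right(sorted_test, value)
        # Nearest-rank inverse of the training CDF, computed in integer
        # arithmetic: ceil(rank * n_train / n_test) - 1.
        index = (rank * n_train + n_test - 1) // n_test - 1
        normalized.append(sorted_train[index])
    return normalized


# Toy data: the test channel is a shifted and scaled copy of the training channel.
train = [float(x) for x in range(10)]
test = [10.0 + 2.0 * x for x in range(10)]
restored = histogram_normalize(test, train)
```

In this toy case the mismatch is a monotonic transformation, so the mapping recovers the training values exactly. With real data the two distributions only match approximately, and the number of histogram bins or quantiles becomes a tuning choice.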
Bibliographic reference. Molau, Sirko / Hilger, Florian / Keysers, Daniel / Ney, Hermann (2002): "Enhanced histogram normalization in the acoustic feature space", In ICSLP-2002, 1421-1424.