ISCA Archive Interspeech 2009

Integrating codebook and utterance information in cepstral statistics normalization techniques for robust speech recognition

Guan-min He, Jeih-weih Hung

Cepstral statistics normalization techniques have been shown to be very successful at improving the noise robustness of speech features. This paper proposes a hybrid-based scheme to achieve a more accurate estimate of the statistical information of features in these techniques. By properly integrating codebook and utterance knowledge, the resulting hybrid-based approach significantly outperforms conventional utterance-based, segment-based and codebook-based approaches in noisy environments. For the Aurora-2 clean-condition training task, the proposed hybrid codebook/segment-based histogram equalization (CS-HEQ) achieves an average recognition accuracy of 90.66%, which is better than utterance-based HEQ (87.62%), segment-based HEQ (85.92%) and codebook-based HEQ (85.29%). Furthermore, the high-performance CS-HEQ can be implemented with a short delay and can thus be applied in real-time online systems.
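
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional utterance-based HEQ, one of the approaches the abstract compares against. It is not the proposed CS-HEQ, whose codebook/segment integration is not described in the abstract. The function name `heq_utterance`, the rank-based CDF estimate, and the standard normal reference distribution are assumptions made for illustration only.

```python
from statistics import NormalDist


def heq_utterance(features):
    """Utterance-based histogram equalization (illustrative sketch).

    features: list of frames, each a list of cepstral coefficients.
    Each dimension is equalized independently: the empirical CDF of a
    coefficient is estimated from its rank within the utterance and then
    mapped through the inverse CDF of a reference distribution.
    """
    n_frames = len(features)
    if n_frames == 0:
        return []
    n_dims = len(features[0])
    ref = NormalDist(0.0, 1.0)  # assumed reference: standard normal
    out = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        # Frame indices sorted by the value of coefficient d.
        order = sorted(range(n_frames), key=lambda t: features[t][d])
        for rank, t in enumerate(order):
            # Rank-based CDF estimate, kept strictly inside (0, 1).
            cdf = (rank + 0.5) / n_frames
            out[t][d] = ref.inv_cdf(cdf)
    return out


if __name__ == "__main__":
    toy = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]  # made-up values
    for frame in heq_utterance(toy):
        print(frame)
```

Because the statistics here are gathered over the whole utterance, this baseline must wait for the utterance to end before normalizing any frame, which is the delay that the abstract says CS-HEQ keeps short.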


doi: 10.21437/Interspeech.2009-357

Cite as: He, G.-m., Hung, J.-w. (2009) Integrating codebook and utterance information in cepstral statistics normalization techniques for robust speech recognition. Proc. Interspeech 2009, 1239-1242, doi: 10.21437/Interspeech.2009-357

@inproceedings{he09_interspeech,
  author={Guan-min He and Jeih-weih Hung},
  title={{Integrating codebook and utterance information in cepstral statistics normalization techniques for robust speech recognition}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1239--1242},
  doi={10.21437/Interspeech.2009-357}
}