8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Extended Powered Cepstral Normalization (P-CN) with Range Equalization for Robust Features in Speech Recognition

Chang-wen Hsu, Lin-shan Lee

National Taiwan University, Taiwan

Cepstral normalization is widely used as a powerful approach to producing robust features for speech recognition. Powered Cepstral Normalization (P-CN) was recently proposed to normalize the MFCC parameters in the r1-th order powered domain, where r1 > 1.0, and then transform the features back with a 1/r2 power to a domain better suited to recognition; it was shown to produce robust features. Here we extend P-CN to a more effective and efficient form, in which good values of r2 are found on-line for each utterance, in real time, based on the concept of dynamic range equalization. The basic idea is that the difference in the dynamic ranges of the feature parameters is a good indicator of the mismatch that degrades recognition performance. Extensive experimental results show that the Extended P-CN with range equalization proposed in this paper significantly outperforms conventional cepstral normalization and P-CN in all noisy conditions.
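
The processing chain described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify several details, so the following are assumptions made only for illustration: negative cepstral values are handled with a sign-preserving power, the normalization in the powered domain is mean and variance normalization, r2 is chosen by a simple grid search, and the reference dynamic range (target_range) is supplied by the caller. All function names are hypothetical.

```python
import math


def signed_power(x, p):
    # Assumption: keep the sign of x and raise only its magnitude to p,
    # so that negative cepstral values remain well defined.
    return math.copysign(abs(x) ** p, x)


def powered_cn(track, r1, r2):
    """Normalize one cepstral coefficient track in the r1-powered domain,
    then map it back with a 1/r2 power."""
    powered = [signed_power(c, r1) for c in track]
    mean = sum(powered) / len(powered)
    var = sum((v - mean) ** 2 for v in powered) / len(powered)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 for a constant track
    # Assumption: mean and variance normalization in the powered domain.
    normalized = [(v - mean) / std for v in powered]
    return [signed_power(v, 1.0 / r2) for v in normalized]


def dynamic_range(track):
    return max(track) - min(track)


def pick_r2(track, r1, target_range, candidates):
    """Assumed range equalization: pick the candidate r2 whose output
    dynamic range is closest to a reference range."""
    return min(
        candidates,
        key=lambda r2: abs(dynamic_range(powered_cn(track, r1, r2)) - target_range),
    )


# Toy values standing in for one MFCC coefficient over one utterance.
track = [-3.1, 0.4, 2.2, -0.7, 5.0, 1.3]
candidates = [1.0, 1.5, 2.0, 2.5, 3.0]
r2 = pick_r2(track, r1=1.5, target_range=4.0, candidates=candidates)
features = powered_cn(track, 1.5, r2)
```

With r1 = r2 = 1.0 this sketch reduces to ordinary mean and variance normalization, which gives a convenient sanity check. In practice the reference range would presumably come from the training data, but the abstract does not say how it is obtained.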

Bibliographic reference. Hsu, Chang-wen / Lee, Lin-shan (2007): "Extended powered cepstral normalization (P-CN) with range equalization for robust features in speech recognition", In INTERSPEECH-2007, 1106-1109.