10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Feature Extraction for Robust Speech Recognition Using a Power-Law Nonlinearity and Power-Bias Subtraction

Chanwoo Kim, Richard M. Stern

Carnegie Mellon University, USA

This paper presents a new feature extraction algorithm called Power-Normalized Cepstral Coefficients (PNCC) that is based on auditory processing. Major new features of PNCC processing include a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC processing, and a novel algorithm that suppresses background excitation by estimating SNR from the ratio of the arithmetic mean power to the geometric mean power and then subtracting the inferred background power. Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for various types of additive noise. The computational cost of PNCC is only slightly greater than that of conventional MFCC processing.
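
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the exponent of 1/15, the flooring constants, and the toy power values are all assumptions chosen for illustration, and the abstract does not say how the background power is actually inferred. The sketch only shows that a power-law nonlinearity stays finite at zero power where a log does not, and that adding a constant background power lowers the arithmetic-to-geometric mean ratio, which subtracting that power restores.

```python
import math


def power_law(power, exponent=1.0 / 15.0):
    """Power-law compression. The exponent is an assumed illustrative value."""
    return power ** exponent


def log_compress(power, floor=1e-12):
    """Log compression with an assumed floor to avoid log(0)."""
    return math.log(max(power, floor))


def am_gm_ratio(powers, floor=1e-12):
    """Ratio of the arithmetic mean to the geometric mean of the powers."""
    n = len(powers)
    arithmetic = sum(powers) / n
    geometric = math.exp(sum(math.log(max(p, floor)) for p in powers) / n)
    return arithmetic / geometric


def subtract_bias(powers, bias, floor_ratio=1e-3):
    """Subtract a candidate background power, flooring at a fraction of the input."""
    return [max(p - bias, floor_ratio * p) for p in powers]


# Toy per-frame powers in one channel: quiet and loud frames alternate.
clean = [0.01, 4.0, 0.01, 4.0]
noisy = [p + 1.0 for p in clean]          # constant additive background power
restored = subtract_bias(noisy, bias=1.0)  # bias assumed known in this toy case

ratio_clean = am_gm_ratio(clean)
ratio_noisy = am_gm_ratio(noisy)
ratio_restored = am_gm_ratio(restored)
```

In this toy case `ratio_noisy` is much smaller than `ratio_clean`, and `ratio_restored` returns close to `ratio_clean`, which is the kind of cue a bias-subtraction scheme could use when choosing how much power to remove.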


Bibliographic reference. Kim, Chanwoo / Stern, Richard M. (2009): "Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction", in INTERSPEECH-2009, 28-31.