ISCA Archive Interspeech 2009

Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction

Chanwoo Kim, Richard M. Stern

This paper presents a new feature extraction algorithm called Power-Normalized Cepstral Coefficients (PNCC) that is based on auditory processing. Major new features of PNCC processing include a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC processing, and a novel algorithm that suppresses background excitation by estimating the SNR from the ratio of the arithmetic mean power to the geometric mean power and then subtracting the inferred background power. Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for various types of additive noise. The computational cost of PNCC is only slightly greater than that of conventional MFCC processing.
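
The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exponent, the bias-estimation procedure, or any flooring, so the function names, the default exponent of 1/15, and the `floor` parameter are all assumptions chosen for illustration.

```python
import math


def power_law(power, exponent=1.0 / 15.0):
    """Compress a non-negative power value with a power law.

    The exponent is an illustrative assumption; the abstract gives no value.
    Unlike log(power), this stays finite (zero) when power is zero.
    """
    return power ** exponent


def am_to_gm_ratio(powers, floor=1e-12):
    """Ratio of the arithmetic mean to the geometric mean of a power sequence.

    The ratio is 1 for a constant sequence and grows as the sequence
    becomes more uneven. `floor` (an assumption) keeps the log finite.
    """
    clipped = [max(p, floor) for p in powers]
    arithmetic_mean = sum(clipped) / len(clipped)
    geometric_mean = math.exp(sum(math.log(p) for p in clipped) / len(clipped))
    return arithmetic_mean / geometric_mean


def subtract_bias(powers, bias, floor=1e-12):
    """Subtract a constant background power, clamping results at `floor`."""
    return [max(p - bias, floor) for p in powers]


# Toy example: a hypothetical per-frame power sequence, with and without
# a constant additive background power.
clean = [1.0, 100.0]
noisy = [p + 10.0 for p in clean]
ratio_clean = am_to_gm_ratio(clean)
ratio_noisy = am_to_gm_ratio(noisy)
```

In this toy example the constant background flattens the sequence, so `ratio_noisy` is smaller than `ratio_clean`, and subtracting the same constant with `subtract_bias` restores the original values. That is the intuition behind using the arithmetic-to-geometric mean ratio as an SNR cue; how the paper infers the background power from it is not described in the abstract.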


doi: 10.21437/Interspeech.2009-5

Cite as: Kim, C., Stern, R.M. (2009) Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction. Proc. Interspeech 2009, 28-31, doi: 10.21437/Interspeech.2009-5

@inproceedings{kim09_interspeech,
  author={Chanwoo Kim and Richard M. Stern},
  title={{Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={28--31},
  doi={10.21437/Interspeech.2009-5}
}