Cepstral normalization for robust speech recognition

Alejandro Acero, Richard M. Stern

In this paper we discuss several issues concerning the development of spoken language systems that are robust to changes in the acoustical environment. For Sphinx, the CMU continuous-speech, speaker-independent recognition system, cepstral processing offers easier integration, greater computational efficiency, and greater accuracy than processing in the spectral domain. We also present algorithms that adapt to new environments by estimating noise level and spectral tilt directly from the input speech, without the need for environment-specific training data. Finally, we test our algorithms on a number of different microphones and acoustical environments in an effort to obtain microphone-independent systems.
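
As a rough illustration of why the cepstral domain is convenient for environment compensation, the sketch below uses plain cepstral mean subtraction. This is a generic and simpler technique, not the algorithms presented in this paper. It rests on the standard property that a fixed linear filter (for example, a microphone's spectral tilt) multiplies the spectrum and therefore adds a constant vector to every cepstral frame. The function name, the array shapes, and the synthetic data are assumptions made for illustration only, and the sketch does not address additive noise.

```python
import numpy as np


def cepstral_mean_normalize(cepstra):
    """Subtract the per-utterance mean from each cepstral coefficient.

    cepstra: array of shape (num_frames, num_coeffs).
    Hypothetical helper, not taken from the paper.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    return cepstra - cepstra.mean(axis=0, keepdims=True)


# Synthetic stand-ins for real features (assumed values, for illustration).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))   # hypothetical clean cepstral frames
tilt = np.linspace(1.0, -1.0, 13)    # hypothetical fixed channel offset
filtered = clean + tilt              # linear filtering adds a constant cepstral vector

# After mean subtraction, the clean and filtered utterances coincide.
same = np.allclose(
    cepstral_mean_normalize(clean),
    cepstral_mean_normalize(filtered),
)
```

Under these assumptions the constant offset is removed using only the utterance itself, with no environment-specific training data. Estimating the noise level as well requires more than a mean subtraction.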

