This paper presents a method for extracting MFCC parameters from a normalised power spectral density. The underlying spectral normalisation method is based on the observation that low-energy speech regions need more robustness, since noise is more dominant in these regions and the speech is therefore more corrupted. Low-energy speech regions usually contain unvoiced sounds, which include nearly half of the consonants, and are by nature the least reliable because noise is effectively present even when the speech is acquired under controlled conditions. This spectral normalisation was tested under additive artificial white noise in an isolated speech recogniser and showed very promising results. It is well known that, for speech representation, MFCC parameters appear to be more effective than power-spectrum-based features. This paper shows how the cepstral speech representation can take advantage of the above-mentioned spectral normalisation and presents results for continuous speech recognition in clean and artificial noise conditions.
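As a rough illustration of the cepstral step mentioned in the abstract, the sketch below computes MFCC-style coefficients from a single frame's power spectrum using the textbook pipeline (triangular mel filterbank, log, DCT-II). It is not the authors' implementation: the abstract does not specify the spectral normalisation, so that step is not reproduced here, and the function names, filter count, coefficient count, and energy floor are all assumptions chosen for the example. A normalised spectrum would simply be passed in place of `power`.

```python
import math

def hz_to_mel(f):
    # Common mel-scale approximation (assumed; other variants exist)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters over n_bins spectrum bins spanning 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_from_power_spectrum(power, sample_rate, n_filters=20, n_ceps=12, floor=1e-10):
    """Hypothetical helper: cepstral coefficients 1..n_ceps for one frame.

    `power` is a one-sided power spectrum (bin 0 = 0 Hz, last bin = Nyquist);
    in the paper's setting it would be the normalised spectrum.
    """
    bank = mel_filterbank(n_filters, len(power), sample_rate)
    # Log filterbank energies, floored to avoid log(0)
    log_e = [math.log(max(sum(w * p for w, p in zip(row, power)), floor))
             for row in bank]
    # DCT-II of the log energies, skipping the 0th (overall level) coefficient
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(1, n_ceps + 1)]
```

Because the 0th coefficient is skipped, these coefficients do not change when the whole frame is multiplied by a constant gain, as long as no filterbank energy hits the floor. A frequency-dependent normalisation, by contrast, does change them.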
Cite as: Lima, C.S., Tavares, A.C., Silva, C.A., Oliveira, J.F. (2004) Spectral normalisation MFCC derived features for robust speech recognition. Proc. 9th Conference on Speech and Computer (SPECOM 2004), 120-127
@inproceedings{lima04_specom, author={Carlos S. Lima and Adriano C. Tavares and Carlos A. Silva and Jorge F. Oliveira}, title={{Spectral normalisation MFCC derived features for robust speech recognition}}, year=2004, booktitle={Proc. 9th Conference on Speech and Computer (SPECOM 2004)}, pages={120--127} }