9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Comparison of AM-FM Based Features for Robust Speech Recognition

K. V. S. Narayana, T. V. Sreenivas

Indian Institute of Science, India

Effective feature extraction for robust speech recognition is a widely addressed topic, and there is currently much effort to invoke non-stationary signal models instead of the quasi-stationary signal models that lead to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling, and new feature sets for automatic speech recognition (ASR) have recently been derived from a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performance for robust speech recognition in noise, using the AURORA-2 database. We show that the FEPSTRUM representation is more effective than the others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM.
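
For readers unfamiliar with the Teager energy operator mentioned above, the following is a minimal sketch of its standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. It is not the feature extraction used in the paper; the function name `teager_energy` and the test-tone parameters `A` and `omega` are assumptions chosen purely for illustration. For a pure tone A*cos(omega*n), the operator output is the constant (A*sin(omega))^2, so it tracks amplitude and frequency jointly, which is why it is commonly paired with AM-FM signal models.

```python
import math


def teager_energy(x):
    """Discrete Teager energy of a sequence, for interior samples only.

    psi[n] = x[n]^2 - x[n-1] * x[n+1]; the first and last samples are
    skipped because they lack a neighbour on one side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal (not from the paper): a pure tone A*cos(omega*n).
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(50)]

psi = teager_energy(tone)

# For a pure tone the operator is constant and equals (A*sin(omega))^2.
expected = (A * math.sin(omega)) ** 2
max_error = max(abs(p - expected) for p in psi)
```

Here `max_error` should be at the level of floating-point rounding, since the identity is exact for a sampled sinusoid.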

Bibliographic reference. Narayana, K. V. S. / Sreenivas, T. V. (2008): "Comparison of AM-FM based features for robust speech recognition", In INTERSPEECH-2008, 1545-1548.