Many research efforts in feature extraction for automatic speech recognition focus on analyzing the slow amplitude fluctuations of speech. This study investigates the importance of spectral and temporal resolution in amplitude modulation frequency analysis in order to provide guidance for appropriate filter design. To this end, filters with different wavelet-like and Fourier-transform-like time scales are examined, i.e. the relative importance of time and frequency separation is compared. The results demonstrate that analyzing three separate amplitude modulation frequency bands of constant bandwidth, covering the range from about 2 to 16 Hz, is sufficient for automatic speech recognition.
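The distinction between constant absolute and constant relative bandwidth can be illustrated with a minimal sketch. It tiles the range of about 2 to 16 Hz named in the abstract into three bands in two ways. The function names, the equal tiling, and the exact band edges are assumptions made for illustration only; they are not the filter parameters used in the paper.

```python
def constant_absolute_bands(f_low, f_high, n_bands):
    """Split [f_low, f_high] into bands of equal width in Hz (Fourier-like)."""
    width = (f_high - f_low) / n_bands
    return [(f_low + i * width, f_low + (i + 1) * width) for i in range(n_bands)]


def constant_relative_bands(f_low, f_high, n_bands):
    """Split [f_low, f_high] into bands with an equal upper/lower edge ratio (wavelet-like)."""
    ratio = (f_high / f_low) ** (1.0 / n_bands)
    return [(f_low * ratio ** i, f_low * ratio ** (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Hypothetical edges: the abstract only states "about 2 to 16 Hz" and three bands.
    for name, bands in [
        ("constant absolute bandwidth", constant_absolute_bands(2.0, 16.0, 3)),
        ("constant relative bandwidth", constant_relative_bands(2.0, 16.0, 3)),
    ]:
        print(name)
        for lo, hi in bands:
            print(f"  {lo:6.2f} Hz to {hi:6.2f} Hz  (width {hi - lo:5.2f} Hz)")
```

Under these assumptions the constant absolute tiling gives three bands of equal width (14/3 Hz each), whereas the constant relative tiling gives roughly octave-wide bands (2 to 4, 4 to 8, and 8 to 16 Hz), so frequency resolution is traded for time resolution as the modulation frequency rises.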
Index Terms: amplitude modulation, speech recognition, wavelet transform, feature extraction
Bibliographic reference. Moritz, Niko / Anemüller, Jörn / Kollmeier, Birger (2012): "Amplitude modulation filters as feature sets for robust ASR: constant absolute or relative bandwidth?", In INTERSPEECH-2012, 1231-1234.