Gabor features have been proposed for extracting spectro-temporal modulation information and have yielded significant improvements in recognition performance. In this paper, we propose integrating Gabor posteriors with MFCC posteriors, which yields a relative improvement of 14.3% over an MFCC Tandem system. For different types of acoustic units, we analyze the complementarity between Gabor features, which carry long-term spectro-temporal modulation information from the mel-spectrogram, and MFCC features, which carry short-term temporal information in the cepstral domain. We find that Gabor features are better for vowel recognition, while MFCCs are better for consonants. This explains why their integration offers improvements.
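The abstract does not state how the two posterior streams are integrated. As an illustration only, the sketch below merges two per-frame phoneme posterior vectors with a log-linear (weighted geometric mean) rule, one common way to combine posterior streams. The function name `combine_posteriors`, the `weight` and `floor` parameters, and the three-class example values are all hypothetical and are not taken from the paper.

```python
import math


def combine_posteriors(p_gabor, p_mfcc, weight=0.5, floor=1e-10):
    """Merge two posterior vectors over the same phoneme set.

    Assumption: a log-linear (weighted geometric mean) rule; the paper's
    actual integration method may differ. `weight` is the share given to
    the Gabor stream, `floor` guards against log(0).
    """
    if len(p_gabor) != len(p_mfcc):
        raise ValueError("posterior vectors must have the same length")

    # Weighted sum of log posteriors, one score per phoneme class.
    log_scores = [
        weight * math.log(max(g, floor)) + (1.0 - weight) * math.log(max(m, floor))
        for g, m in zip(p_gabor, p_mfcc)
    ]

    # Subtract the maximum before exponentiating for numerical stability,
    # then renormalize so the result sums to one.
    peak = max(log_scores)
    unnormalized = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical posteriors for one frame over three phoneme classes.
gabor_frame = [0.7, 0.2, 0.1]
mfcc_frame = [0.5, 0.4, 0.1]
combined = combine_posteriors(gabor_frame, mfcc_frame, weight=0.5)
```

Setting `weight` to 1.0 or 0.0 recovers the Gabor or MFCC stream alone, so the parameter can be tuned on held-out data. Given the finding that each feature type favors a different phoneme class, a class-dependent weight would be a natural variation, though that too is an assumption rather than something the abstract describes.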
Bibliographic reference. Li, Shang-wen / Sun, Liang-che / Lee, Lin-shan (2010): "Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features", In INTERSPEECH-2010, 1177-1180.