ISCA Archive Interspeech 2008

Introducing temporal asymmetries in feature extraction for automatic speech recognition

G. S. V. S. Sivaram, Hynek Hermansky

We propose a new auditory-inspired feature extraction technique for automatic speech recognition (ASR). Features are extracted by filtering the temporal trajectory of spectral energies in each critical band of speech with a bank of finite impulse response (FIR) filters. The impulse responses of these filters are derived from a modified Gabor envelope in order to emulate the asymmetries of the temporal receptive field (TRF) profiles observed in higher-level auditory neurons. Over the MRASTA technique, we obtain an 11.4% relative improvement in word error rate on the OGI-Digits database and a 3.2% relative improvement in phoneme error rate on the TIMIT database.
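The filtering step described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the modified Gabor envelope, so the code below assumes a Gaussian envelope with different widths before and after its peak and uses its first derivative as the impulse response. The function names, the filter widths, and the `asymmetry` ratio are all hypothetical choices made for illustration, not the authors' parameters.

```python
import numpy as np


def asymmetric_envelope(half_len, sigma_left, sigma_right):
    """Gaussian-shaped envelope with a different width on each side of its peak.

    ASSUMPTION: this stands in for the paper's modified Gabor envelope,
    whose exact form is not given in the abstract.
    """
    t = np.arange(-half_len, half_len + 1, dtype=float)
    sigma = np.where(t < 0, sigma_left, sigma_right)
    return np.exp(-0.5 * (t / sigma) ** 2)


def make_filter_bank(half_len, widths, asymmetry):
    """Build one zero-mean FIR filter per width (all values are illustrative)."""
    filters = []
    for s in widths:
        env = asymmetric_envelope(half_len, s, s * asymmetry)
        h = np.gradient(env)      # first derivative of the envelope
        h -= h.mean()             # zero mean: a constant trajectory maps to 0
        h /= np.abs(h).sum()      # simple gain normalization
        filters.append(h)
    return np.stack(filters)


def filter_trajectories(band_energies, filters):
    """Filter each critical-band trajectory with every filter in the bank.

    band_energies: array of shape (n_bands, n_frames), e.g. log energies,
                   with n_frames at least as long as the filters.
    returns:       array of shape (n_filters, n_bands, n_frames).
    """
    n_bands, n_frames = band_energies.shape
    out = np.empty((filters.shape[0], n_bands, n_frames))
    for i, h in enumerate(filters):
        for b in range(n_bands):
            out[i, b] = np.convolve(band_energies[b], h, mode="same")
    return out


# Usage on random data standing in for log critical-band energies.
rng = np.random.default_rng(0)
energies = rng.standard_normal((15, 200))            # 15 bands, 200 frames
bank = make_filter_bank(half_len=50, widths=[2.0, 4.0, 8.0], asymmetry=2.0)
features = filter_trajectories(energies, bank)
```

Setting `asymmetry` to 1.0 in this sketch gives a symmetric envelope, which makes it easy to compare symmetric and asymmetric filters on the same input.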


doi: 10.21437/Interspeech.2008-207

Cite as: Sivaram, G.S.V.S., Hermansky, H. (2008) Introducing temporal asymmetries in feature extraction for automatic speech recognition. Proc. Interspeech 2008, 890-893, doi: 10.21437/Interspeech.2008-207

@inproceedings{sivaram08_interspeech,
  author={G. S. V. S. Sivaram and Hynek Hermansky},
  title={{Introducing temporal asymmetries in feature extraction for automatic speech recognition}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={890--893},
  doi={10.21437/Interspeech.2008-207}
}