ISCA Archive Interspeech 2005

Auditory image model features for automatic speech recognition

Mario E. Munich, Qiguang Lin

Conventional speech recognition engines extract Mel-frequency cepstral coefficient (MFCC) features from incoming speech. This paper presents a novel approach to feature extraction in which speech is processed according to the Auditory Image Model, a model of human psychoacoustics. We first describe the proposed front end and then present recognition results obtained on the TIMIT database. Compared with previously published results on the same task, the new approach achieves a 10% improvement in recognition accuracy.


doi: 10.21437/Interspeech.2005-148

Cite as: Munich, M.E., Lin, Q. (2005) Auditory image model features for automatic speech recognition. Proc. Interspeech 2005, 3037-3040, doi: 10.21437/Interspeech.2005-148

@inproceedings{munich05_interspeech,
  author={Mario E. Munich and Qiguang Lin},
  title={{Auditory image model features for automatic speech recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={3037--3040},
  doi={10.21437/Interspeech.2005-148}
}