In this paper, we propose incorporating features derived from the spectro-temporal receptive fields (STRFs) of neurons in the auditory cortex into the task of phoneme recognition. Each of these STRFs is tuned to different auditory frequencies, scales, and modulation rates. We select separate sets of STRFs, each specific to the phonemes of a different broad phonetic class (BPC) of sounds. These STRFs are then applied as spectro-temporal filters to speech spectrograms to extract features for phoneme recognition. On the TIMIT phoneme recognition task, the proposed features show an improvement of about 5% over conventional feature extraction techniques.
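As a rough illustration of what applying STRFs as spectro-temporal filters to a spectrogram can look like, the following minimal sketch filters a spectrogram with a small bank of 2D kernels and stacks the outputs into per-frame feature vectors. It is not the authors' implementation. The Gabor-shaped kernels are a stand-in for the STRFs, and every name and parameter here (`gabor_strf`, `filter_spectrogram`, `strf_features`, the kernel sizes, and the scale and rate values) is hypothetical and chosen only for illustration. The selection of STRF sets per broad phonetic class is not modeled.

```python
import numpy as np


def gabor_strf(n_freq=9, n_time=9, scale=0.25, rate=0.25, sigma=2.0):
    """Hypothetical stand-in for an STRF: a zero-mean 2D Gabor patch.

    `scale` is the modulation along frequency (cycles per channel) and
    `rate` the modulation along time (cycles per frame). These are
    illustrative assumptions, not values taken from the paper.
    """
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-(F**2 + T**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * (scale * F + rate * T))
    kernel = envelope * carrier
    return kernel - kernel.mean()  # remove the DC response


def filter_spectrogram(spec, kernel):
    """Same-size 2D correlation of a (freq, time) spectrogram with a kernel.

    Zero padding is used at the edges; odd kernel dimensions are assumed.
    """
    kf, kt = kernel.shape
    padded = np.pad(spec, ((kf // 2, kf // 2), (kt // 2, kt // 2)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kf, kt))
    return np.einsum("ftij,ij->ft", windows, kernel)


def strf_features(spec, kernels):
    """Stack all filter outputs into one feature vector per time frame."""
    outputs = [filter_spectrogram(spec, k) for k in kernels]
    return np.concatenate(outputs, axis=0).T  # (n_frames, n_kernels * n_freq)


# Toy input: random values standing in for a 40-channel, 100-frame spectrogram.
rng = np.random.default_rng(0)
spec = rng.random((40, 100))
kernels = [gabor_strf(scale=s, rate=r)
           for s, r in [(0.1, 0.1), (0.25, 0.0), (0.0, 0.25)]]
feats = strf_features(spec, kernels)
print(feats.shape)  # (100, 120)
```

In this sketch each frame receives one value per filter and frequency channel. A real recognizer would typically reduce or normalize these features before classification, a step that is left out here.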
Bibliographic reference. Thomas, Samuel / Patil, Kailash / Ganapathy, Sriram / Mesgarani, Nima / Hermansky, Hynek (2010): "A phoneme recognition framework based on auditory spectro-temporal receptive fields", In INTERSPEECH-2010, 2458-2461.