10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Auditory Model Based Optimization of MFCCs Improves Automatic Speech Recognition Performance

Saikat Chatterjee, Christos Koniaris, W. Bastiaan Kleijn

KTH, Sweden

Using a spectral auditory model together with perturbation-based analysis, we develop a new framework for optimizing a set of features so that it emulates the behavior of the human auditory system. The optimization is carried out off-line, based on the conjecture that the local geometries of the feature domain and the perceptual auditory domain should be similar. Using this principle, we modify and optimize the static mel-frequency cepstral coefficients (MFCCs) without any feedback from the speech recognition system. We show that recognition performance improves in both clean and noisy environmental conditions.
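
The similarity principle stated in the abstract can be illustrated with a small numerical sketch. This is not the authors' procedure: the function names `local_distortions` and `geometry_mismatch`, the Gaussian perturbations, the squared Euclidean distortion, and the least-squares scale fit are all assumptions chosen for illustration. The sketch perturbs an input slightly, measures the resulting distortion in a feature domain and in a stand-in "auditory" domain, and reports how far the two sets of distortions are from being proportional.

```python
import numpy as np

def local_distortions(mapping, x, perturbations):
    """Squared Euclidean distortion in the mapped domain for each perturbation of x."""
    base = mapping(x)
    return np.array([np.sum((mapping(x + d) - base) ** 2) for d in perturbations])

def geometry_mismatch(feature_map, auditory_map, x, n_perturb=200, eps=1e-3, seed=0):
    """Relative residual after fitting one scale between the two distortion sets.

    Returns a value in [0, 1]; 0 means the local distortions are exactly
    proportional for the sampled perturbations (hypothetical measure).
    """
    rng = np.random.default_rng(seed)
    perturbations = eps * rng.standard_normal((n_perturb, x.size))
    d_feat = local_distortions(feature_map, x, perturbations)
    d_aud = local_distortions(auditory_map, x, perturbations)
    # Least-squares scale that best maps feature distortions onto auditory ones.
    scale = np.dot(d_feat, d_aud) / np.dot(d_feat, d_feat)
    residual = d_aud - scale * d_feat
    return np.sum(residual ** 2) / np.sum(d_aud ** 2)

# Toy stand-ins for the two domains (random linear maps, purely illustrative).
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 16))
B = rng.standard_normal((8, 16))
x = rng.standard_normal(16)

auditory = lambda v: A @ v
matched_features = lambda v: 2.0 * (A @ v)   # same geometry up to a scale
unmatched_features = lambda v: B @ v         # unrelated geometry

print(geometry_mismatch(matched_features, auditory, x))
print(geometry_mismatch(unmatched_features, auditory, x))
```

In an optimization along these lines, the parameters of the feature mapping would be adjusted off-line to reduce such a mismatch over many speech frames, with no recognizer in the loop.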

Bibliographic reference. Chatterjee, Saikat / Koniaris, Christos / Kleijn, W. Bastiaan (2009): "Auditory model based optimization of MFCCs improves automatic speech recognition performance", in INTERSPEECH-2009, 2987-2990.