In this paper, we propose a new feature extraction method for emotion recognition based on physiological knowledge of the emotion production mechanism. Researchers in physiological acoustics have reported that emotional speech is encoded differently from normal speech in terms of the articulatory organs, and that emotion information in speech is concentrated in different frequency regions as a result of the different movements of these organs. To apply these findings, we first quantify the distribution of speech emotion information across frequency bands using Fisher's F-ratio and mutual information, and then propose a non-uniform sub-band processing method that extracts and emphasizes the emotion features in speech. The extracted features are finally applied to emotion recognition. Experimental results in speech emotion recognition show that the features extracted with the proposed non-uniform sub-band processing outperform traditional MFCC features, with an average error reduction rate of 16.8%.
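The band-wise quantification step can be illustrated with a minimal sketch of Fisher's F-ratio, assuming one common definition (variance of the per-class means divided by the mean within-class variance, computed independently for each band). The function name `f_ratio_per_band`, the input layout (one row per utterance, one column per sub-band energy), and the small stabilizing constant are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def f_ratio_per_band(features, labels):
    """Fisher's F-ratio for each frequency band (illustrative definition).

    features: array of shape (n_samples, n_bands), e.g. sub-band energies
    labels:   array of shape (n_samples,), one emotion label per sample
    Returns an array of shape (n_bands,); larger values suggest the band
    separates the emotion classes better.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Per-class mean and variance of every band.
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    class_vars = np.stack([features[labels == c].var(axis=0) for c in classes])

    between = class_means.var(axis=0)   # spread of class means across emotions
    within = class_vars.mean(axis=0)    # average spread inside each emotion
    return between / (within + 1e-12)   # small constant avoids division by zero
```

Under this assumed definition, bands with a high ratio would be candidates for finer resolution or stronger weighting in a non-uniform sub-band scheme, while bands with a ratio near zero carry little class-separating information.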
Bibliographic reference. Zhou, Yu / Sun, Yanqing / Li, Junfeng / Zhang, Jianping / Yan, Yonghong (2009): "Physiologically-inspired feature extraction for emotion recognition", In INTERSPEECH-2009, 1975-1978.