Two new approaches to feature extraction for automatic emotion classification in speech are described and tested. The methods are motivated by recent laryngological experiments measuring glottal air flow during phonation. The proposed approach calculates the area under the spectral energy envelope of the speech signal (AUSEES) and of the glottal waveform (AUSEEG). Both methods provided very high recognition rates for seven emotional states (contempt, angry, anxious, dysphoric, pleasant, neutral and happy). The speech data comprised 170 adult speakers (95 female and 75 male). The new features achieved significantly higher classification accuracy (89.95% for AUSEEG, 76.07% for AUSEES) than the baseline MFCC approach (37.81%). The glottal-waveform-based AUSEEG features outperformed the speech-based AUSEES features, indicating that the majority of the emotion information is likely introduced during glottal wave formation.
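The abstract does not give implementation details, but the core idea (an area under the spectral energy envelope of short-time frames) can be sketched as follows. This is a hypothetical reconstruction, not the authors' code: the frame length, band count, and the use of a per-band maximum as the envelope estimate are all assumptions; AUSEEG would apply the same computation to a glottal waveform obtained by inverse filtering, which is omitted here.

```python
import numpy as np

def ausee_features(signal, sr, frame_len=0.025, frame_step=0.010, n_bands=20):
    """Per-frame area under the spectral energy envelope (hedged sketch).

    Assumed parameters: 25 ms Hamming-windowed frames with 10 ms step,
    and an envelope estimated as the maximum power in each of n_bands
    equal-width frequency bands.
    """
    fl = int(frame_len * sr)
    step = int(frame_step * sr)
    window = np.hamming(fl)
    feats = []
    for start in range(0, len(signal) - fl + 1, step):
        frame = signal[start:start + fl] * window
        power = np.abs(np.fft.rfft(frame)) ** 2          # short-time power spectrum
        bands = np.array_split(power, n_bands)           # equal-width bands
        envelope = np.array([b.max() for b in bands])    # coarse spectral envelope
        # area under the envelope via the trapezoidal rule
        area = 0.5 * (envelope[:-1] + envelope[1:]).sum()
        feats.append(area)
    return np.array(feats)

# Usage on a synthetic 200 Hz tone (stand-in for a voiced speech segment):
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 200 * t)
features = ausee_features(tone, sr)
```

In the paper, these per-frame values would then feed a classifier; here the sketch only shows the feature side.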
Bibliographic reference. He, Ling / Lech, Margaret / Allen, Nicholas (2010): "On the importance of glottal flow spectral energy for the recognition of emotions in speech", in Proc. INTERSPEECH 2010, pp. 2346-2349.