11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

On the Importance of Glottal Flow Spectral Energy for the Recognition of Emotions in Speech

Ling He (1), Margaret Lech (1), Nicholas Allen (2)

(1) RMIT University, Australia
(2) University of Melbourne, Australia

Two new approaches to feature extraction for automatic emotion classification in speech are described and tested. The methods are motivated by recent laryngological experiments examining glottal air flow during phonation. They calculate the area under the spectral energy envelope of the speech signal (AUSEES) and of the glottal waveform (AUSEEG). The new methods provided high recognition rates for seven emotions (contempt, angry, anxious, dysphoric, pleasant, neutral and happy) on speech data from 170 adult speakers (95 female and 75 male). The new features gave significantly higher classification accuracy (89.95% for AUSEEG, 76.07% for AUSEES) than the baseline MFCC approach (37.81%). The glottal waveform based AUSEEG features outperformed the speech based AUSEES features, indicating that the majority of the emotion information is likely to be added to speech during the glottal wave formation.
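
The abstract does not say how the area under the spectral energy envelope is computed, so the following is only a rough sketch of the general idea and not the authors' procedure. It assumes a single Hamming-windowed frame, uses the squared FFT magnitude as a stand-in for the spectral energy envelope, splits the spectrum into equal-width bands, and integrates each band with the trapezoidal rule. The function name, the band layout and the `n_bands` parameter are all hypothetical. For an AUSEEG-style feature the input would be an estimated glottal waveform rather than the speech frame; glottal estimation is not shown here.

```python
import numpy as np

def spectral_energy_band_areas(frame, sample_rate, n_bands=4):
    """Area under the power spectrum of one frame, per frequency band.

    Assumptions (not taken from the paper): Hamming window, squared FFT
    magnitude as the energy envelope, equal-width bands up to Nyquist.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # One-sided power spectrum and the frequency (Hz) of each bin.
    energy = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Hypothetical band edges: n_bands equal-width bands from 0 Hz to Nyquist.
    edges = np.linspace(0.0, sample_rate / 2.0, n_bands + 1)

    areas = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs <= hi)
        f, e = freqs[mask], energy[mask]
        # Trapezoidal rule written out explicitly over the bins in this band.
        areas.append(float(np.sum(0.5 * (e[1:] + e[:-1]) * np.diff(f))))
    return np.array(areas)

# Usage: a synthetic 440 Hz tone, 50 ms at 8 kHz (illustrative input only).
t = np.arange(400) / 8000.0
areas = spectral_energy_band_areas(np.sin(2 * np.pi * 440 * t), 8000)
```

Each frame then yields one value per band, and these values could be collected into a feature vector for a classifier. A perceptually motivated band layout could replace the equal-width bands used here.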


Bibliographic reference.  He, Ling / Lech, Margaret / Allen, Nicholas (2010): "On the importance of glottal flow spectral energy for the recognition of emotions in speech", In INTERSPEECH-2010, 2346-2349.