11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Novel Feature Extraction Strategy for Multi-Stream Robust Emotion Identification

Gang Liu, Yun Lei, John H. L. Hansen

University of Texas at Dallas, USA

In this study, we investigate an effective feature extraction front-end for improved emotion identification from speech in clean and noisy conditions. First, we explore the application of the perceptual minimum variance distortionless response (PMVDR) feature for emotion characterization. Originally developed for accent/dialect and language identification (LID), PMVDR features are less sensitive to noise. The shifted delta cepstral (SDC) approach, also developed for LID, can be used to incorporate additional temporal information about the speech into the feature vectors. It is well known that suprasegmental characteristics, such as pitch and intensity, provide beneficial information for emotion recognition, and we believe that further improvement can be obtained from improved features. We evaluated the approach on the Berlin database of emotional speech. The proposed system, PMVDR-SDC, outperforms the baseline system by 10.1% absolute, which demonstrates the validity of the approach. Furthermore, we find that both PMVDR and SDC offer much better robustness in noisy conditions than the other features considered, which is critical for real applications.
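
To illustrate how SDC adds temporal context to a frame-level feature vector, the following is a minimal sketch of the commonly used SDC formulation, not the authors' implementation. For each frame t it stacks k delta blocks, where block i is c(t + i*p + d) - c(t + i*p - d). The function name, the default values of d, p and k, and the clamping of out-of-range frame indices to the first or last frame are all assumptions made for illustration; the abstract does not state the configuration used in the paper.

```python
from typing import List, Sequence


def shifted_delta_cepstra(
    cepstra: Sequence[Sequence[float]],
    d: int = 1,   # delta spread in frames (assumed value)
    p: int = 3,   # shift between consecutive delta blocks (assumed value)
    k: int = 7,   # number of delta blocks stacked per frame (assumed value)
) -> List[List[float]]:
    """Return one SDC vector per input frame.

    Each output vector has k * N entries, where N is the dimension of the
    input cepstral (or PMVDR) frame.
    """
    num_frames = len(cepstra)
    if num_frames == 0:
        return []

    def frame(idx: int) -> Sequence[float]:
        # Assumption: indices outside the utterance are clamped to the
        # nearest valid frame instead of being zero-padded or dropped.
        return cepstra[min(max(idx, 0), num_frames - 1)]

    sdc: List[List[float]] = []
    for t in range(num_frames):
        stacked: List[float] = []
        for i in range(k):
            ahead = frame(t + i * p + d)
            behind = frame(t + i * p - d)
            # Delta block i: difference of frames d ahead of and d behind
            # the shifted position t + i * p.
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


if __name__ == "__main__":
    # Toy 2-dimensional "cepstra" that grow linearly with the frame index.
    toy = [[float(t), 2.0 * t] for t in range(10)]
    features = shifted_delta_cepstra(toy, d=1, p=2, k=2)
    print(len(features), len(features[0]))
```

Because each output vector spans roughly k*p frames of context, the stacked deltas capture longer-term dynamics than a single first-order delta, which is the temporal information the abstract refers to.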


Bibliographic reference. Liu, Gang / Lei, Yun / Hansen, John H. L. (2010): "A novel feature extraction strategy for multi-stream robust emotion identification", in INTERSPEECH-2010, 482-485.