In this study, we investigate an effective feature extraction front-end for improved emotion identification from speech in clean and noisy conditions. First, we explore the application of the perceptual minimum variance distortionless response (PMVDR) feature to emotion characterization. Originally developed for accent/dialect and language identification (LID), PMVDR features are less sensitive to noise. The shifted delta cepstral (SDC) approach, also developed for LID, can be used to incorporate additional temporal information about the speech into the feature vectors. Suprasegmental characteristics such as pitch and intensity are known to provide useful information for emotion recognition, and we believe further improvement can be gained from improved features. We performed the evaluation on the Berlin Database of Emotional Speech. The proposed system, PMVDR-SDC, outperforms the baseline system by 10.1% absolute, which demonstrates the validity of the approach. Furthermore, we find that both PMVDR and SDC offer much better robustness in noisy conditions than the alternatives, which is critical for real-world applications.
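To illustrate how SDC adds temporal context to a feature vector, here is a minimal Python sketch that stacks k delta blocks, each computed over a window of ±d frames and shifted by P frames from the previous block. The function name, the default values (d=1, P=3, k=7, a configuration common in LID work), and the clamping of out-of-range frame indices are assumptions made for illustration; the abstract does not state the configuration used in the study, and the input is assumed to be any per-frame cepstral feature (for example PMVDR coefficients).

```python
def shifted_delta_cepstra(frames, d=1, p=3, k=7):
    """Return one stacked SDC vector per input frame.

    frames: list of per-frame cepstral vectors (lists of floats).
    d: half-width of the delta window, p: shift between blocks,
    k: number of delta blocks stacked per frame.
    """
    if not frames:
        return []
    last = len(frames) - 1

    def frame_at(t):
        # Assumption: indices past either end reuse the nearest edge frame.
        return frames[min(max(t, 0), last)]

    sdc = []
    for t in range(len(frames)):
        stacked = []
        for i in range(k):
            # Block i: c(t + i*p + d) - c(t + i*p - d)
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc
```

Each output vector has k times the dimension of the input frame, so a 7-dimensional cepstral frame with k=7 yields a 49-dimensional SDC vector that spans roughly k·P frames of context.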
Bibliographic reference. Liu, Gang / Lei, Yun / Hansen, John H. L. (2010): "A novel feature extraction strategy for multi-stream robust emotion identification", In INTERSPEECH-2010, 482-485.