13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Preliminary Study on Cross-Databases Emotion Recognition using the Glottal Features in Speech

Rui Sun, Elliot Moore II

Department of Electrical and Computer Engineering, Georgia Institute of Technology, Savannah, GA, USA

While the majority of traditional research in emotional speech recognition has relied on a single database for assessment, the lack of large databases has presented a significant challenge in generalizing results for the purpose of building a robust emotion classification system. Recently, work has been reported on cross-training across emotional databases to examine the consistency and reliability of acoustic measures in emotional assessment. This paper presents preliminary results on the use of glottal-based features in cross-testing (i.e., training on one database and testing on another) across three databases for recognition of neutral, angry, happy, and sad emotions. A comparative study using pitch-based features is also presented. The results suggest that the glottal features are more robust in the 4-class emotion classification system developed in this study and perform well above chance in several of the cross-testing experiments.
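The cross-testing protocol described above (train on one corpus, test on a different one, over all ordered database pairs) can be sketched in a few lines. The code below is an illustration only: it uses synthetic Gaussian clusters as stand-ins for the three emotional speech databases and a simple nearest-centroid classifier, not the paper's glottal or pitch features or its actual classifier; all names (`make_database`, `cross_test`, the database labels A/B/C) are hypothetical.

```python
import random

EMOTIONS = ["neutral", "angry", "happy", "sad"]  # the paper's 4 classes

def make_database(seed, n_per_class=30, dim=4, shift=0.0):
    """Synthetic stand-in for an emotional speech database: each emotion
    is a Gaussian cluster in feature space, with a database-specific
    offset to mimic corpus mismatch. (Illustrative only.)"""
    rng = random.Random(seed)
    data = []
    for k, emo in enumerate(EMOTIONS):
        for _ in range(n_per_class):
            vec = [k + shift + rng.gauss(0.0, 0.5) for _ in range(dim)]
            data.append((vec, emo))
    return data

def train_centroids(data):
    """Fit a nearest-centroid classifier: mean feature vector per class."""
    sums, counts = {}, {}
    for vec, emo in data:
        acc = sums.setdefault(emo, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[emo] = counts.get(emo, 0) + 1
    return {emo: [s / counts[emo] for s in acc] for emo, acc in sums.items()}

def classify(centroids, vec):
    """Predict the class whose centroid is nearest (squared Euclidean)."""
    def dist(emo):
        return sum((a - b) ** 2 for a, b in zip(centroids[emo], vec))
    return min(centroids, key=dist)

def cross_test(train_db, test_db):
    """Train on one database, test on another; return accuracy."""
    centroids = train_centroids(train_db)
    correct = sum(classify(centroids, vec) == emo for vec, emo in test_db)
    return correct / len(test_db)

databases = {
    "A": make_database(seed=1, shift=0.0),
    "B": make_database(seed=2, shift=0.3),
    "C": make_database(seed=3, shift=-0.3),
}

# Evaluate every ordered (train, test) pair of distinct databases;
# chance level for 4 balanced classes is 0.25.
for train_name, train_db in databases.items():
    for test_name, test_db in databases.items():
        if train_name != test_name:
            acc = cross_test(train_db, test_db)
            print(f"train {train_name} / test {test_name}: {acc:.2f}")
```

With three databases this yields six ordered train/test pairs, matching the cross-testing layout the abstract describes; comparing two feature sets (e.g., glottal vs. pitch) amounts to repeating the loop with each feature extraction.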

Index Terms: emotion recognition, cross-databases, glottal features, pitch


Bibliographic reference.  Sun, Rui / Moore II, Elliot (2012): "A preliminary study on cross-databases emotion recognition using the glottal features in speech", In INTERSPEECH-2012, 1628-1631.