Speech emotion recognition is a challenging speech technology with a broad range of applications. In this paper, we propose fusing global statistical features and segmental spectral features at the decision level for speech emotion recognition. Each emotional utterance is scored independently by two recognition systems, one based on global statistics and one based on the segmental spectrum, and a weighted linear combination of their scores yields the final decision. Experimental results on an emotional speech database demonstrate that the global statistical and segmental spectral features are complementary, and that the proposed fusion approach improves on the performance of either system alone.
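As an illustration of the decision-level fusion described above, the following is a minimal sketch, not the authors' implementation. It assumes each system emits one score per emotion class on a comparable scale and that a single weight in [0, 1] balances the two systems. The function names `fuse_scores` and `decide`, the emotion labels, and the default weight are hypothetical; the abstract does not state the classifiers, the score normalization, or the weight used.

```python
from typing import Dict


def fuse_scores(
    global_scores: Dict[str, float],
    segmental_scores: Dict[str, float],
    weight: float = 0.5,  # assumed value; the abstract gives no fusion weight
) -> Dict[str, float]:
    """Weighted linear combination of per-emotion scores from two systems."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if global_scores.keys() != segmental_scores.keys():
        raise ValueError("both systems must score the same emotion classes")
    # weight scales the global-statistics system, (1 - weight) the segmental one.
    return {
        emotion: weight * global_scores[emotion]
        + (1.0 - weight) * segmental_scores[emotion]
        for emotion in global_scores
    }


def decide(fused_scores: Dict[str, float]) -> str:
    """Final decision: the emotion class with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)


if __name__ == "__main__":
    # Hypothetical scores for one utterance, for illustration only.
    g = {"anger": 0.6, "neutral": 0.4}
    s = {"anger": 0.3, "neutral": 0.7}
    print(decide(fuse_scores(g, s, weight=0.5)))
```

In practice the weight would be tuned on held-out data, and the two systems' scores would typically need to be normalized to a common range before being combined.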
Bibliographic reference. Hu, Hao / Xu, Ming-Xing / Wu, Wei (2007): "Fusion of global statistical and segmental spectral features for speech emotion recognition", In INTERSPEECH-2007, 2269-2272.