We describe extensive experiments on the recognition of emotion from speech using acoustic features only. Two databases of acted emotional speech (Berlin and DES) have been used in this work. The principal focus is on methods for selection of good features from a relatively large set of hand-crafted features, perhaps formed by fusing different feature sets used by different researchers. We show that the monotonic assumption underlying popular sequential selection algorithms does not hold, and use this finding to improve recognition accuracy. We show further that a very simple classifier (k-nearest neighbour) produces better results than any so far reported by other researchers on these databases, suggesting that previous work has failed to match the complexity of the classifier used to the complexity of the data. Finally, several potentially fruitful avenues for future work are outlined.
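The abstract refers to sequential feature selection paired with a k-nearest-neighbour classifier. As a minimal sketch only (not the authors' implementation; the synthetic data, function names, and greedy stopping rule are illustrative assumptions), sequential forward selection with a kNN wrapper can be written as:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test point by majority vote among its k nearest training points."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)       # Euclidean distance to all training points
        nearest = y_train[np.argsort(d)[:k]]          # labels of the k closest neighbours
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])         # majority vote
    return np.array(preds)

def sequential_forward_selection(X_train, y_train, X_val, y_val, k=3):
    """Greedy SFS: repeatedly add the single feature that most improves
    validation accuracy; stop when no addition helps.  This greedy stop
    implicitly relies on the monotonic assumption the paper questions."""
    selected, best_acc = [], 0.0
    remaining = list(range(X_train.shape[1]))
    while remaining:
        scores = []
        for f in remaining:
            cols = selected + [f]
            acc = np.mean(knn_predict(X_train[:, cols], y_train,
                                      X_val[:, cols], k) == y_val)
            scores.append((acc, f))
        acc, f = max(scores)
        if acc <= best_acc:   # no single feature improves the subset
            break
        best_acc, selected = acc, selected + [f]
        remaining.remove(f)
    return selected, best_acc

def make_data(n, rng):
    """Hypothetical toy data: only feature 0 carries class information."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0.0, 1.0, (n, 4))
    X[:, 0] += 5.0 * y
    return X, y

rng = np.random.default_rng(0)
X_tr, y_tr = make_data(80, rng)
X_va, y_va = make_data(40, rng)
selected, acc = sequential_forward_selection(X_tr, y_tr, X_va, y_va, k=3)
print("selected features:", selected, "validation accuracy:", round(acc, 2))
```

On this toy data the wrapper picks the informative feature and stops once additions no longer help; on real fused feature sets, the paper's point is that such greedy stopping can be misled when the monotonicity assumption fails.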
Bibliographic reference. Hassan, Ali / Damper, Robert I. (2009): "Emotion recognition from speech using extended feature selection and a simple classifier", in Proc. INTERSPEECH 2009, pp. 2043-2046.