9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Abandoning Emotion Classes - Towards Continuous Emotion Recognition with Modelling of Long-Range Dependencies

Martin Wöllmer (1), Florian Eyben (1), Stephan Reiter (2), Björn Schuller (1), Cate Cox (3), Ellen Douglas-Cowie (3), Roddy Cowie (3)

(1) Technische Universität München, Germany; (2) EB Automotive GmbH, Germany; (3) Queen's University Belfast, UK

Class-based emotion recognition from speech, as performed in most work to date, imposes many restrictions on practical applications. Human emotion is a continuum, and an automatic emotion recognition system must be able to recognise it as such. We present a novel approach to continuous emotion recognition based on Long Short-Term Memory Recurrent Neural Networks, which model long-range dependencies between observations and thus outperform techniques such as Support Vector Regression. Transferring the concept of additionally modelling emotional history to the classification of discrete levels of the emotional dimensions "valence" and "activation", we also apply Conditional Random Fields, which outperform the commonly used Support Vector Machines. Experiments on data recorded while humans interacted with a Sensitive Artificial Listener show that, for activation, the derived classifiers perform as well as human annotators.


Bibliographic reference. Wöllmer, Martin / Eyben, Florian / Reiter, Stephan / Schuller, Björn / Cox, Cate / Douglas-Cowie, Ellen / Cowie, Roddy (2008): "Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies", in INTERSPEECH-2008, 597-600.