ISCA Archive ICSLP 2000

Recognition of emotion in a realistic dialogue scenario

Richard Huber, Anton Batliner, Jan Buckow, Elmar Nöth, Volker Warnke, Heinrich Niemann

Modern automatic dialogue systems are able to understand complex sentences instead of only a few commands like Stop or No. In a call center, such a system should be able to determine, in a critical phase of the dialogue, whether the call should be passed over to a human operator. Such a critical phase can be indicated by the customer's vocal expression. Other studies have shown that it is possible to distinguish between anger and neutral speech with prosodic features alone. Subjects in these studies were mostly people acting or simulating emotions like anger. In this paper we use data from a so-called Wizard of Oz (WoZ) scenario to obtain more realistic data instead of simulated anger. We show that the classification rate for the two classes "emotion" (class E) and "neutral" (class ¬E) is significantly worse for these more realistic data. Furthermore, the classification results are heavily speaker dependent. Prosody alone might thus not be sufficient and has to be supplemented by other knowledge sources such as the detection of repetitions, reformulations, swear words, and dialogue acts.


Cite as: Huber, R., Batliner, A., Buckow, J., Nöth, E., Warnke, V., Niemann, H. (2000) Recognition of emotion in a realistic dialogue scenario. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 665-668

@inproceedings{huber00_icslp,
  author={Richard Huber and Anton Batliner and Jan Buckow and Elmar Nöth and Volker Warnke and Heinrich Niemann},
  title={{Recognition of emotion in a realistic dialogue scenario}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={665--668}
}