ISCA Archive Interspeech 2009

Arousal and valence prediction in spontaneous emotional speech: felt versus perceived emotion

Khiet P. Truong, David A. van Leeuwen, Mark A. Neerincx, Franciska M. G. de Jong

In this paper, we describe emotion recognition experiments on spontaneous affective speech, carried out with the aim of comparing the added value of felt-emotion annotation with that of perceived-emotion annotation. Using speech material available in the tno-gaming corpus (a corpus containing audiovisual recordings of people playing video games), we developed speech-based affect recognizers that predict scalar Arousal and Valence values. Two types of recognizers were developed in parallel: one trained with felt emotion annotations (generated by the gamers themselves) and one trained with perceived/observed emotion annotations (generated by a group of observers). The experiments showed that, in speech, with the methods and features currently used, observed emotions are easier to predict than felt emotions. The results suggest that recognition performance strongly depends on how and by whom the emotion annotations are carried out.


doi: 10.21437/Interspeech.2009-583

Cite as: Truong, K.P., Leeuwen, D.A.v., Neerincx, M.A., Jong, F.M.G.d. (2009) Arousal and valence prediction in spontaneous emotional speech: felt versus perceived emotion. Proc. Interspeech 2009, 2027-2030, doi: 10.21437/Interspeech.2009-583

@inproceedings{truong09_interspeech,
  author={Khiet P. Truong and David A. van Leeuwen and Mark A. Neerincx and Franciska M. G. de Jong},
  title={{Arousal and valence prediction in spontaneous emotional speech: felt versus perceived emotion}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2027--2030},
  doi={10.21437/Interspeech.2009-583}
}