Speech Emotion Recognition Using Affective Saliency

Arodami Chorianopoulou, Polychronis Koutsakis, Alexandros Potamianos


We investigate an affective saliency approach for speech emotion recognition of spoken dialogue utterances that estimates the amount of emotional information over time. The proposed saliency approach uses a regression model that combines features extracted from the acoustic signal and the posteriors of a segment-level classifier to obtain frame- or segment-level ratings. The affective saliency model is trained using a minimum classification error (MCE) criterion that learns the weights by optimizing an objective loss function related to the classification error rate of the emotion recognition system. Affective saliency scores are then used to weight the contribution of frame-level posteriors and/or features to the speech emotion classification decision. The algorithm is evaluated for the task of anger detection on four call-center datasets in two languages, Greek and English, with good results.
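
The weighting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that saliency comes from "a regression model", so the logistic form, the normalized weighted average, and every name here (`saliency_scores`, `weighted_posterior`, `classify`, the `weights` and `bias` values) are assumptions chosen for illustration. The weights are fixed by hand, and MCE training is not shown.

```python
import math


def saliency_scores(features, weights, bias=0.0):
    """Map each segment's feature vector to a saliency score in (0, 1).

    Assumption: a logistic function of a linear combination of the
    features. The abstract does not specify the regression form.
    """
    scores = []
    for x in features:
        z = bias + sum(w * xi for w, xi in zip(weights, x))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def weighted_posterior(posteriors, saliency):
    """Saliency-weighted average of segment-level class posteriors."""
    total = sum(saliency)
    n_classes = len(posteriors[0])
    return [
        sum(s * p[c] for s, p in zip(saliency, posteriors)) / total
        for c in range(n_classes)
    ]


def classify(posteriors, saliency, labels=("neutral", "anger")):
    """Pick the class with the largest saliency-weighted posterior."""
    fused = weighted_posterior(posteriors, saliency)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical utterance with three segments; each posterior is
# [P(neutral), P(anger)] from some segment-level classifier.
posteriors = [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
# One made-up feature per segment (for example, a normalized energy value).
features = [[0.1], [0.2], [3.0]]

saliency = saliency_scores(features, weights=[2.0], bias=-2.0)
label, fused = classify(posteriors, saliency)
uniform_label, _ = classify(posteriors, [1.0, 1.0, 1.0])
print(label, uniform_label)
```

With these made-up numbers, the third segment receives most of the weight, so the weighted decision is "anger", while a uniform average over the same posteriors yields "neutral". This is the intended effect of letting emotionally salient segments dominate the utterance-level decision.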


DOI: 10.21437/Interspeech.2016-1311

Cite as

Chorianopoulou, A., Koutsakis, P., Potamianos, A. (2016) Speech Emotion Recognition Using Affective Saliency. Proc. Interspeech 2016, 500-504.

Bibtex
@inproceedings{Chorianopoulou+2016,
author={Arodami Chorianopoulou and Polychronis Koutsakis and Alexandros Potamianos},
title={Speech Emotion Recognition Using Affective Saliency},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1311},
url={http://dx.doi.org/10.21437/Interspeech.2016-1311},
pages={500--504}
}