One interesting phenomenon of natural conversation is overlapping speech. Besides causing difficulties in automatic speech processing, such overlaps carry information about the state of the overlapper: competitive overlaps (i.e. “interruptions”) can signal disagreement or the feeling of being overlooked, and cooperative overlaps (i.e. supportive interjections) can signal agreement and interest. These hints can be used to improve human-machine interaction. In this paper, we present an approach for automatic classification of competitive and cooperative overlaps using the emotional content of the speakers’ utterances before and after the overlap. For these experiments, we use real-world data from human-human interactions in call centres. We also compare our approach to standard acoustic classification on the same data and conclude that emotional features are clearly superior to acoustic features for this task, resulting in an unweighted average F-measure of 71.9%. However, acoustic features should not be entirely neglected: using a late fusion procedure, we can further improve the unweighted average F-measure by 2.6%.
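The two evaluation concepts named in the abstract, the unweighted average F-measure and late fusion, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the fusion rule, so the weighted averaging of per-classifier probabilities below is an assumption, and the function names, the `weight` parameter, and the 0.5 decision threshold are hypothetical choices made for illustration only.

```python
def f_measure(y_true, y_pred, label):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def unweighted_average_f_measure(y_true, y_pred):
    """Mean of the per-class F-measures; every class counts equally,
    regardless of how many samples it has."""
    labels = sorted(set(y_true))
    return sum(f_measure(y_true, y_pred, lab) for lab in labels) / len(labels)


def late_fusion(p_emotional, p_acoustic, weight=0.5):
    """Assumed fusion rule: weighted average of each classifier's
    P(competitive), thresholded at 0.5 (hypothetical, not from the paper)."""
    fused = [weight * pe + (1 - weight) * pa
             for pe, pa in zip(p_emotional, p_acoustic)]
    return ["competitive" if p >= 0.5 else "cooperative" for p in fused]
```

Because each class contributes equally to the average, this measure is not dominated by the majority class, which matters when competitive and cooperative overlaps occur with different frequencies.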
Cite as: Egorow, O., Wendemuth, A. (2017) Emotional Features for Speech Overlaps Classification. Proc. Interspeech 2017, 2356-2360, doi: 10.21437/Interspeech.2017-87
@inproceedings{egorow17_interspeech,
  author={Olga Egorow and Andreas Wendemuth},
  title={{Emotional Features for Speech Overlaps Classification}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2356--2360},
  doi={10.21437/Interspeech.2017-87}
}