The SRI CLEO Speaker-State Corpus

Andreas Kathol, Elizabeth Shriberg, Massimiliano de Zambotti


We introduce the SRI CLEO (Conversational Language about Everyday Objects) Speaker-State Corpus of speech, video, and biosignals. The goal of the corpus is to provide insight into the speech and physiological changes that result from subtle, context-based influences on affect and cognition. Speakers were prompted with collections of pictures of neutral everyday objects and were instructed to talk about any subset of the objects for a preset period of time (120 or 180 seconds, depending on the task).

The corpus provides signals for 43 speakers under four different speaker-state conditions: (1) neutral and emotionally charged audiovisual background; (2) cognitive load; (3) time pressure; and (4) various acted emotions. Unlike previous studies that have linked speaker state to the content of the speaking task itself, the CLEO prompts remain largely pragmatically, semantically, and affectively neutral across all conditions. This framework allows more direct comparisons across both conditions and speakers. The corpus also includes more traditional speaker tasks involving reading and free-form reporting of neutral and emotionally charged content. The biosignals explored include skin conductance, respiration, blood pressure, and ECG. The corpus is in the final stages of processing and will be made available to the research community.


DOI: 10.21437/Interspeech.2016-1141

Cite as

Kathol, A., Shriberg, E., de Zambotti, M. (2016) The SRI CLEO Speaker-State Corpus. Proc. Interspeech 2016, 1541-1544.

Bibtex
@inproceedings{Kathol+2016,
author={Andreas Kathol and Elizabeth Shriberg and Massimiliano de Zambotti},
title={The SRI CLEO Speaker-State Corpus},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1141},
url={http://dx.doi.org/10.21437/Interspeech.2016-1141},
pages={1541--1544}
}