15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Building a Naturalistic Emotional Speech Corpus by Retrieving Expressive Behaviors from Existing Speech Corpora

Soroosh Mariooryad, Reza Lotfian, Carlos Busso

University of Texas at Dallas, USA

A key element in affective computing is access to large corpora of genuine emotional samples collected during natural conversations. Recording natural interactions over the telephone is an appealing approach to building emotional databases. However, collecting real conversational data with expressive reactions is a challenging task, especially if the recordings are to be shared with the community, where issues such as privacy arise. This study explores a novel approach that retrieves emotional reactions from existing spontaneous speech databases collected for general speech processing problems. Most of the recordings in these databases are expected to contain non-emotional expressions. Given the naturalness of the interactions, however, the flow of the conversation can lead to emotional responses from the conversation partners, and these are the responses we aim to retrieve. We use the IEMOCAP and SEMAINE databases to build emotion detection systems. We then use these classifiers to identify emotional behaviors in the FISHER database, a large conversational speech corpus recorded over the phone. Subjective evaluations of the retrieved samples demonstrate the potential of the proposed scheme for building naturalistic emotional speech databases.
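The retrieval idea in the abstract (train a detector on annotated emotional corpora, then rank segments of a large unlabeled corpus and keep the top candidates for human evaluation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which features or classifiers the authors use, so a toy nearest-centroid detector stands in for them, and the feature vectors, segment ids, and function names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train_detector(features, labels):
    """Fit a toy nearest-centroid detector; labels are 1 (emotional) or 0 (neutral).

    Stand-in for the classifiers trained on annotated corpora; not the paper's model.
    """
    emotional = centroid([f for f, y in zip(features, labels) if y == 1])
    neutral = centroid([f for f, y in zip(features, labels) if y == 0])
    return emotional, neutral


def emotional_score(detector, x):
    """Positive when x lies closer to the emotional centroid than to the neutral one."""
    emotional, neutral = detector
    return dist(x, neutral) - dist(x, emotional)


def retrieve(detector, pool, k):
    """Return the ids of the k unlabeled segments with the highest emotional score."""
    ranked = sorted(
        pool,
        key=lambda seg_id: emotional_score(detector, pool[seg_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical two-dimensional features standing in for an annotated corpus.
train_x = [[0.9, 0.8], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
detector = train_detector(train_x, train_y)

# Hypothetical unlabeled segments standing in for a large conversational corpus.
pool = {"seg_a": [0.15, 0.1], "seg_b": [0.85, 0.9], "seg_c": [0.5, 0.6]}
candidates = retrieve(detector, pool, k=2)
print(candidates)
```

In a pipeline of this shape, the retrieved candidates would not be treated as labeled data directly; they would be passed to human raters, in line with the subjective evaluations the abstract mentions.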


Bibliographic reference. Mariooryad, Soroosh / Lotfian, Reza / Busso, Carlos (2014): "Building a naturalistic emotional speech corpus by retrieving expressive behaviors from existing speech corpora", in INTERSPEECH-2014, 238-242.