EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Recognition of Emotions in Interactive Voice Response Systems

Sherif Yacoub, Steve Simske, Xiaofan Lin, John Burns

Hewlett-Packard Laboratories, USA

This paper reports results on recognizing emotions from speech signals, with particular focus on extracting emotion features from the short utterances typical of Interactive Voice Response (IVR) applications. We focus on distinguishing angry from neutral speech, a distinction that is salient to call center applications. We also report on the classification of other emotions, such as sadness, boredom, happiness, and cold anger. We compare results obtained with neural networks, Support Vector Machines (SVM), K-Nearest Neighbors, and decision trees. We use a database from the Linguistic Data Consortium at the University of Pennsylvania, recorded by 8 actors expressing 15 emotions. Results indicate that hot anger and neutral utterances can be distinguished with over 90% accuracy. We also show results for recognizing other emotions and illustrate which emotions can be clustered together using the selected prosodic features.
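
As a rough illustration of one of the classifier families named in the abstract, the following is a minimal K-Nearest Neighbors sketch in Python. It is not the authors' implementation. The two features (mean pitch and mean energy), the numeric values, and the function name `knn_predict` are all assumptions made for illustration; the abstract does not specify the actual prosodic feature set or its values.

```python
import math
from collections import Counter

# Synthetic, made-up feature vectors: (mean pitch in Hz, mean energy in dB).
# These values are illustrative only and are NOT taken from the paper.
TRAIN = [
    ((220.0, 72.0), "hot anger"),
    ((235.0, 75.0), "hot anger"),
    ((210.0, 70.0), "hot anger"),
    ((120.0, 58.0), "neutral"),
    ((130.0, 60.0), "neutral"),
    ((125.0, 57.0), "neutral"),
]


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training items by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN, (228.0, 74.0)))
    print(knn_predict(TRAIN, (122.0, 59.0)))
```

In practice the features would likely need to be normalized before computing distances, since pitch and energy are on different scales; that step is omitted here to keep the sketch short.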

Bibliographic reference.  Yacoub, Sherif / Simske, Steve / Lin, Xiaofan / Burns, John (2003): "Recognition of emotions in interactive voice response systems", In EUROSPEECH-2003, 729-732.