5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Recognizing Emotions in Speech Using Short-term and Long-term Features

Yang Li, Yunxin Zhao

University of Illinois at Urbana-Champaign, USA

The acoustic characteristics of speech are influenced by the speaker's emotional state. In this study, we attempted to recognize the emotional state of individual speakers using two kinds of speech features: short-term features extracted from short-time analysis frames, and long-term features that represent entire utterances. Principal component analysis was used to assess the importance of individual features in representing emotional categories. Three classification methods were used: vector quantization, artificial neural networks, and Gaussian mixture density models. Classification was carried out with short-term features only, with long-term features only, and with both combined. The best recognition accuracy, 62%, was achieved by the Gaussian mixture density method using both short-term and long-term features.
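
As a rough illustration of how a Gaussian mixture classifier might combine the two feature types, the sketch below scores an utterance against one mixture model per emotion and picks the label with the highest score. It is not the authors' implementation. The abstract does not say how the short-term and long-term scores were combined, so the rule used here (average frame log-likelihood plus utterance-level log-likelihood) is an assumption, as are the diagonal covariances, the function names, the emotion labels, and all numeric parameters.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x) under a mixture given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_utterance(frames, long_term, models):
    """Return (best_label, scores) for one utterance.

    frames:    list of short-term feature vectors, one per analysis frame
    long_term: a single utterance-level feature vector
    models:    label -> {"short": gmm, "long": gmm}
    """
    scores = {}
    for label, model in models.items():
        # Assumed combination rule: mean frame log-likelihood
        # plus the log-likelihood of the utterance-level vector.
        short_score = sum(
            gmm_log_likelihood(f, model["short"]) for f in frames
        ) / len(frames)
        long_score = gmm_log_likelihood(long_term, model["long"])
        scores[label] = short_score + long_score
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy models: 2-D short-term features, 1-D long-term feature.
MODELS = {
    "neutral": {
        "short": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
        "long": [(1.0, [100.0], [400.0])],
    },
    "angry": {
        "short": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
        "long": [(1.0, [200.0], [400.0])],
    },
}

if __name__ == "__main__":
    label, scores = classify_utterance([[4.2, 3.9], [4.8, 5.1]], [190.0], MODELS)
    print(label, scores)
```

Averaging the frame scores keeps utterance length from swamping the single long-term term; a weighted sum would be another reasonable choice under the same assumptions.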


Bibliographic reference. Li, Yang / Zhao, Yunxin (1998): "Recognizing emotions in speech using short-term and long-term features", in ICSLP-1998, paper 0379.