INTERSPEECH 2004 - ICSLP
8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

An Acoustic Study of Emotions Expressed in Speech

Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan, Carlos Busso

University of Southern California, USA

In this study, we investigate acoustic properties of speech associated with four different emotions (sadness, anger, happiness, and neutral) intentionally expressed in speech by an actress. The aim is to obtain detailed acoustic knowledge of how speech is modulated when a speaker's emotion changes from neutral to a certain emotional state. The study is based on measurements of acoustic parameters related to speech prosody, vowel articulation and spectral energy distribution. Acoustic similarities and differences among the emotions are then explored with mutual information computation, multidimensional scaling, and comparison of acoustic likelihoods relative to the neutral emotion. In addition, acoustic separability of the emotions is tested using discriminant analysis at the utterance level, and the result is compared with human evaluation. Results show that happiness/anger and neutral/sadness share similar acoustic properties in this speaker. Speech associated with anger and happiness is characterized by longer utterance duration, shorter inter-word silence, and higher pitch and energy values with wider ranges, showing the characteristics of exaggerated or hyperarticulated speech. The discriminant analysis indicates that within-group acoustic separability is relatively poor, suggesting that the conventional acoustic parameters examined in this study are not effective in describing the emotions along the valence (or pleasure) dimension. It is noted that RMS energy, inter-word silence and speaking rate are useful in distinguishing sadness from the other emotions. Interestingly, the between-group difference in formant patterns seems better reflected in back vowels such as /a/ (/father/) than in the front vowels. Larger lip opening and/or more tongue constriction at the mid or rear part of the vocal tract could be the underlying reasons.
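
The abstract does not say how the mutual information between an acoustic parameter and the emotion label was computed. As a rough illustration only, the sketch below assumes a continuous parameter (for example, utterance-level RMS energy) is first quantized into equal-width bins and that mutual information is then estimated from joint counts. The function names `equal_width_bins` and `mutual_information` and the binning scheme are hypothetical choices, not the authors' method.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Quantize continuous values into n_bins equal-width bins (indices 0..n_bins-1).

    Assumption: equal-width binning; the paper does not state a quantization scheme.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would index bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(feature_bins, labels):
    """Estimate I(X; Y) in bits from two equal-length discrete sequences."""
    if len(feature_bins) != len(labels):
        raise ValueError("feature_bins and labels must have the same length")
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px = Counter(feature_bins)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi
```

Under these assumptions, a parameter whose bins line up closely with the emotion labels yields a value near the label entropy, while a parameter unrelated to the labels yields a value near zero. With few utterances per emotion, such count-based estimates are biased upward, so they are best read as a relative ranking of parameters.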

Bibliographic reference.  Yildirim, Serdar / Bulut, Murtaza / Lee, Chul Min / Kazemzadeh, Abe / Deng, Zhigang / Lee, Sungbok / Narayanan, Shrikanth / Busso, Carlos (2004): "An acoustic study of emotions expressed in speech", In INTERSPEECH-2004, 2193-2196.