ISCA Archive SpeechProsody 2010

Characterization of emotions using the dynamics of prosodic features

K. Sreenivasa Rao, Ramu Reddy, Sudhamay Maity, Shashidhar G. Koolagudi

In this paper, the dynamics of prosodic parameters are explored for recognizing emotions from speech. The dynamics of prosodic parameters refer to local or fine variations in prosodic parameters with respect to time. The proposed dynamic features of prosody are represented by: (1) the sequence of durations of syllables in the utterance (duration contour), (2) the sequence of fundamental frequency values (pitch contour) and (3) the sequence of frame energy values (energy contour). The Indian Institute of Technology Kharagpur Simulated Emotion Speech Corpus (IITKGP-SESC) is used for analyzing the proposed prosodic features for emotion recognition [1]. The emotions considered in this work are anger, disgust, fear, happiness, neutral and sadness. Support vector machines (SVMs) are explored to discriminate the emotions using the proposed prosodic features. Emotion recognition performance is analyzed separately for the duration contours, pitch contours and energy contours, and is observed to be 64%, 67% and 53%, respectively. Fusion techniques are explored at the feature and score levels. The performance of the fusion-based emotion recognition systems is observed to be 69% and 74% for feature-level and score-level fusion, respectively.
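The difference between the two fusion levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fusion rule, so the weighted-sum rule, the function names `fuse_features` and `fuse_scores`, and the weights are all assumptions made for illustration. The per-emotion scores are assumed to come from three separately trained classifiers (SVMs in the paper), which are not reproduced here.

```python
from typing import Dict, List, Sequence

# The six emotion classes listed in the abstract.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness"]


def fuse_features(duration: Sequence[float],
                  pitch: Sequence[float],
                  energy: Sequence[float]) -> List[float]:
    """Feature-level fusion (assumed form): concatenate the three contours
    into a single vector that would be fed to one classifier."""
    return list(duration) + list(pitch) + list(energy)


def fuse_scores(scores: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> str:
    """Score-level fusion (assumed form): weighted sum of the per-emotion
    scores from each classifier, followed by picking the top emotion."""
    fused = [0.0] * len(EMOTIONS)
    for name, per_emotion in scores.items():
        if len(per_emotion) != len(EMOTIONS):
            raise ValueError(f"{name}: expected {len(EMOTIONS)} scores")
        for i, score in enumerate(per_emotion):
            fused[i] += weights[name] * score
    best = max(range(len(EMOTIONS)), key=fused.__getitem__)
    return EMOTIONS[best]
```

In this sketch, feature-level fusion combines evidence before classification, while score-level fusion lets each contour be modeled on its own and combines only the classifier outputs. The weights are hypothetical and would have to be tuned on held-out data.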

Cite as: Rao, K.S., Reddy, R., Maity, S., Koolagudi, S.G. (2010) Characterization of emotions using the dynamics of prosodic features. Proc. Speech Prosody 2010, paper 941

  author={K. Sreenivasa Rao and Ramu Reddy and Sudhamay Maity and Shashidhar G. Koolagudi},
  title={{Characterization of emotions using the dynamics of prosodic features}},
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 941}