10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Cepstral and Long-Term Features for Emotion Recognition

Pierre Dumouchel (1), Najim Dehak (1), Yazid Attabi (1), Réda Dehak (2), Narjès Boufaden (1)

(1) CRIM, Canada
(2) LRDE, France

In this paper, we describe the systems that we developed for the Open Performance Sub-Challenge of the INTERSPEECH 2009 Emotion Challenge. We participated in both the two-class and the five-class emotion detection tasks. For the two-class problem, the best performance was obtained by logistic regression fusion of three systems that use short- and long-term speech features. Fusion yielded an absolute improvement of 2.6% in unweighted recall compared with [1]. For the five-class problem, we submitted two individual systems: a cepstral GMM and a long-term GMM-UBM. The best result came from the cepstral GMM, which produced an absolute improvement of 3.5% compared with [2].
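The results above are reported as unweighted recall, which is commonly computed as the mean of the per-class recalls, so that a rare class counts as much as a frequent one. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name, the class labels, and the toy predictions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    total = defaultdict(int)    # number of reference items per class
    correct = defaultdict(int)  # number of correctly predicted items per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels for an imbalanced two-class problem (not data from the paper).
y_true = ["neg", "neg", "idle", "idle", "idle", "idle", "idle", "idle"]
y_pred = ["neg", "idle", "idle", "idle", "idle", "idle", "idle", "idle"]

uar = unweighted_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"unweighted recall = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the minority class is recognized only half of the time, so the unweighted recall is lower than the plain accuracy, which is dominated by the majority class.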


  1. B. Schuller, S. Steidl, and A. Batliner, “The INTERSPEECH 2009 Emotion Challenge,” in Interspeech. Brighton, UK: ISCA, 2009.
  2. C.-Y. Lin and H.-C. Wang, “Language Identification Using Pitch Contour Information,” in ICASSP, 2005, pp. 601–604.

Bibliographic reference.  Dumouchel, Pierre / Dehak, Najim / Attabi, Yazid / Dehak, Réda / Boufaden, Narjès (2009): "Cepstral and long-term features for emotion recognition", In INTERSPEECH-2009, 344-347.