LSTM Based Cross-corpus and Cross-task Acoustic Emotion Recognition

Heysem Kaya, Dmitrii Fedotov, Ali Yeşilkanat, Oxana Verkholyak, Yang Zhang, Alexey Karpov

Acoustic emotion recognition is a popular and central research direction in paralinguistic analysis, due to its relation to a wide range of affective states/traits and its manifold applications. Developing highly generalizable models remains a challenge for researchers and engineers because of the multitude of nuisance factors. To generalize well, deployed models need to handle spontaneous speech recorded under acoustic conditions that differ from those of the training set. This requires that the models are tested for cross-corpus robustness. In this work, we first investigate the suitability of Long Short-Term Memory (LSTM) models trained with time- and space-continuously annotated affective primitives for cross-corpus acoustic emotion recognition. We next employ an effective approach that uses the frame-level valence and arousal predictions of the LSTM models for utterance-level affect classification, and apply this approach to the ComParE 2018 challenge corpora. The proposed method alone gives encouraging results on both the development and test sets of the Self-Assessed Affect Sub-Challenge. On the development set, the cross-corpus prediction based method boosts performance when fused with the top components of the baseline system. The results indicate the suitability of the proposed method for both time-continuous and utterance-level cross-corpus acoustic emotion recognition tasks.
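The abstract does not state how the frame-level valence and arousal predictions are converted into an utterance-level representation. As an illustration only, the sketch below assumes one common option: summarizing each prediction contour with simple statistical functionals (mean, standard deviation, minimum, maximum). The function name `utterance_features` and the choice of functionals are hypothetical and are not taken from the paper.

```python
import numpy as np


def utterance_features(frame_preds):
    """Summarize frame-level predictions into one utterance-level vector.

    frame_preds: array-like of shape (n_frames, 2), where the columns are
    assumed to hold valence and arousal predictions for each frame.
    Returns a vector of length 8: [mean, std, min, max] per dimension,
    ordered as (mean_v, mean_a, std_v, std_a, min_v, min_a, max_v, max_a).
    """
    preds = np.asarray(frame_preds, dtype=float)
    if preds.ndim != 2 or preds.shape[0] == 0:
        raise ValueError("expected a non-empty (n_frames, n_dims) array")

    # Each functional collapses the time axis (axis 0).
    functionals = [
        preds.mean(axis=0),
        preds.std(axis=0),
        preds.min(axis=0),
        preds.max(axis=0),
    ]
    return np.concatenate(functionals)


# Toy example: two frames of (valence, arousal) predictions.
features = utterance_features([[0.0, 1.0], [2.0, 3.0]])
```

A fixed-length vector of this kind could then be passed to any utterance-level classifier; which classifier and which functionals work best would need to be checked against the paper itself.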

DOI: 10.21437/Interspeech.2018-2298

Cite as: Kaya, H., Fedotov, D., Yeşilkanat, A., Verkholyak, O., Zhang, Y., Karpov, A. (2018) LSTM Based Cross-corpus and Cross-task Acoustic Emotion Recognition. Proc. Interspeech 2018, 521-525, DOI: 10.21437/Interspeech.2018-2298.

  author={Heysem Kaya and Dmitrii Fedotov and Ali Yeşilkanat and Oxana Verkholyak and Yang Zhang and Alexey Karpov},
  title={LSTM Based Cross-corpus and Cross-task Acoustic Emotion Recognition},
  booktitle={Proc. Interspeech 2018},