An Investigation of Emotion Dynamics and Kalman Filtering for Speech-Based Emotion Prediction

Zhaocheng Huang, Julien Epps


Despite recent interest in continuous prediction of dimensional emotions, the dynamical aspect of emotions has received less attention in automated systems. This paper investigates how emotion change can be effectively incorporated to improve continuous prediction of arousal and valence from speech. Significant correlations were found between emotion ratings and their dynamics during investigations on the RECOLA database, and here we examine how best to exploit them using a Kalman filter. In particular, we investigate the correlation of predicted arousal and valence dynamics with the arousal and valence ground truth; the Kalman filter internal delay used for estimating the state transition matrix; the use of emotion dynamics as a measurement input to a Kalman filter; and how multiple probabilistic Kalman filter outputs can be effectively fused. Evaluation results show that correct dynamics estimation and internal delay settings allow up to 5% and 58% relative improvement in arousal and valence prediction respectively over existing Kalman filter implementations. Fusion based on probabilistic Kalman filter outputs yields further gains.
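
To illustrate the idea of using emotion dynamics as a measurement input to a Kalman filter, the following is a minimal sketch and not the authors' implementation. It assumes a two-dimensional state (the emotion value and its first-order change), a constant-velocity transition matrix `F`, and a measurement vector made of a regressor's predicted emotion value and predicted dynamics. The names `kalman_filter`, `F`, `H`, `Q`, `R` and all numeric values are hypothetical choices for illustration; the paper estimates the transition matrix from data with an internal delay, which is not reproduced here.

```python
import numpy as np


def kalman_filter(measurements, F, H, Q, R, x0, P0):
    """Run a standard linear Kalman filter over a sequence of measurements.

    Returns the filtered state means and covariances for every frame.
    """
    x = np.asarray(x0, dtype=float)
    P = np.asarray(P0, dtype=float)
    identity = np.eye(x.shape[0])
    means, covs = [], []
    for z in measurements:
        # Prediction step: propagate the state and its uncertainty.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step: correct the prediction with the new measurement.
        S = H @ P @ H.T + R                 # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (z - H @ x)
        P = (identity - K @ H) @ P
        means.append(x)
        covs.append(P)
    return np.array(means), np.array(covs)


# Assumed model: state = [emotion value, emotion change per frame].
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # emotion_t = emotion_{t-1} + change_{t-1}
H = np.eye(2)                   # both value and dynamics are "measured"
Q = 1e-3 * np.eye(2)            # assumed process noise
R = 1e-1 * np.eye(2)            # assumed measurement (regressor) noise

# Synthetic stand-in for regressor outputs: constant arousal 0.5, zero change.
z = np.tile([0.5, 0.0], (200, 1))
means, covs = kalman_filter(z, F, H, Q, R, x0=np.zeros(2), P0=np.eye(2))
```

Because the filter returns a covariance alongside each mean, its output is probabilistic, which is the property the abstract refers to when discussing fusion of multiple Kalman filter outputs.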


DOI: 10.21437/Interspeech.2017-1707

Cite as: Huang, Z., Epps, J. (2017) An Investigation of Emotion Dynamics and Kalman Filtering for Speech-Based Emotion Prediction. Proc. Interspeech 2017, 3301-3305, DOI: 10.21437/Interspeech.2017-1707.


@inproceedings{Huang2017,
  author={Zhaocheng Huang and Julien Epps},
  title={An Investigation of Emotion Dynamics and Kalman Filtering for Speech-Based Emotion Prediction},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3301--3305},
  doi={10.21437/Interspeech.2017-1707},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1707}
}