ISCA Archive Interspeech 2009

Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions

R. Barra-Chicote, Fernando Fernández, S. Lutfi, Juan Manuel Lucas-Cuesta, J. Macias-Guarasa, J. M. Montero, R. San-Segundo, J. M. Pardo

In this paper we describe the acoustic emotion recognition system built at the Speech Technology Group of the Universidad Politecnica de Madrid (Spain) to participate in the INTERSPEECH 2009 Emotion Challenge. Our proposal is based on the use of a Dynamic Bayesian Network (DBN) to deal with the temporal modelling of the emotional speech information. The selected features (MFCC, F0, energy and their variants) are modelled as different streams, and the F0-related ones are integrated under a Multi-Space Distribution (MSD) framework to properly model the dual (voiced/unvoiced) nature of F0. Experimental evaluation on the challenge test set shows 67.06% and 38.24% unweighted recall for the 2- and 5-class tasks, respectively. In the 2-class case, we achieve results similar to the baseline with a considerably smaller number of features. In the 5-class case, we achieve a statistically significant 6.5% relative improvement.
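To illustrate the MSD idea mentioned in the abstract, the following minimal Python sketch computes a multi-space output probability for an F0 stream: the observation space is the union of a 1-D continuous subspace (voiced frames, carrying an F0 value) and a 0-D discrete subspace (unvoiced frames, no F0). This is only an illustrative sketch of the general MSD formulation, not the authors' implementation; all parameter values and function names are hypothetical.

import math

def gaussian(x, mean, var):
    """1-D Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def msd_f0_likelihood(frame, voiced_weight, unvoiced_weight, mixtures):
    """
    MSD output probability for one F0 observation (hypothetical parameters).

    frame: None for an unvoiced frame, or the F0 value (Hz) for a voiced frame.
    voiced_weight / unvoiced_weight: prior mass assigned to each subspace
        (should sum to 1).
    mixtures: list of (mix_weight, mean, var) tuples for the voiced
        Gaussian mixture on the 1-D subspace.
    """
    if frame is None:
        # Unvoiced frames fall in the 0-D subspace: the density reduces to its weight.
        return unvoiced_weight
    # Voiced frames are scored by the weighted Gaussian mixture on the 1-D subspace.
    return voiced_weight * sum(w * gaussian(frame, m, v) for w, m, v in mixtures)

# Example with hypothetical parameters:
mixtures = [(0.6, 120.0, 400.0), (0.4, 200.0, 900.0)]
print(msd_f0_likelihood(None, 0.7, 0.3, mixtures))   # unvoiced frame
print(msd_f0_likelihood(130.0, 0.7, 0.3, mixtures))  # voiced frame at 130 Hz

In an MSD-based model such emission probabilities would be attached to the F0 stream of each state, so unvoiced frames contribute likelihood without requiring an artificial F0 value.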


doi: 10.21437/Interspeech.2009-109

Cite as: Barra-Chicote, R., Fernández, F., Lutfi, S., Lucas-Cuesta, J.M., Macias-Guarasa, J., Montero, J.M., San-Segundo, R., Pardo, J.M. (2009) Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions. Proc. Interspeech 2009, 336-339, doi: 10.21437/Interspeech.2009-109

@inproceedings{barrachicote09_interspeech,
  author={R. Barra-Chicote and Fernando Fernández and S. Lutfi and Juan Manuel Lucas-Cuesta and J. Macias-Guarasa and J. M. Montero and R. San-Segundo and J. M. Pardo},
  title={{Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={336--339},
  doi={10.21437/Interspeech.2009-109}
}