Analysis of Speech Emotions in Realistic Environments

Biswajit Dev Sarma, Rohan Kumar Das, Abhishek Dey, Risto Haukioja


The classification of emotional speech is a challenging task that depends critically on the correctness of the labeled data. Most of the databases used for research purposes are either acted or simulated. Annotating such acted databases is easier because the actors exaggerate the emotions. In contrast, emotion labeling of real-world data is very difficult due to confusion among the emotion classes. Another problem in such scenarios is class imbalance, because most of the data in realistic environments is found to be neutral. In this study, we perform emotion labeling on realistic data in a customized manner using emotion priority and confidence level. The annotated speech corpus is then used for analysis. The percentage distribution of the different emotion classes in the real-world data and the confusions between emotions during labeling are presented.


DOI: 10.21437/SMM.2018-3

Cite as: Sarma, B.D., Das, R.K., Dey, A., Haukioja, R. (2018) Analysis of Speech Emotions in Realistic Environments. Proc. Workshop on Speech, Music and Mind 2018, 11-15, DOI: 10.21437/SMM.2018-3.


@inproceedings{Sarma2018,
  author={Biswajit Dev Sarma and Rohan Kumar Das and Abhishek Dey and Risto Haukioja},
  title={Analysis of Speech Emotions in Realistic Environments},
  year=2018,
  booktitle={Proc. Workshop on Speech, Music and Mind 2018},
  pages={11--15},
  doi={10.21437/SMM.2018-3},
  url={http://dx.doi.org/10.21437/SMM.2018-3}
}