General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical & Self-Assessed Affect and Heart Beats

Gábor Gosztolya, Tamás Grósz, László Tóth


In the area of computational paralinguistics, there is a growing need for general techniques that can be applied to a variety of tasks and that can be easily realized using standard, publicly available tools. In our contribution to the 2018 Interspeech Computational Paralinguistics Challenge (ComParE), we test four general ways of extracting features. Besides the standard ComParE feature set consisting of 6373 diverse attributes, we experiment with two variations of Bag-of-Audio-Words representations, and define a simple feature set inspired by Gaussian Mixture Models. Our results indicate that the UAR scores obtained via the different approaches vary among the tasks. In our view, this is mainly because most of the feature sets tested were local by nature and could not properly represent the utterances of the Atypical Affect and Self-Assessed Affect Sub-Challenges. On the Crying Sub-Challenge, however, a simple combination of all four feature sets proved to be effective.
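
To illustrate the general idea behind a Bag-of-Audio-Words representation, the following is a minimal sketch, not the authors' pipeline. It assumes frame-level feature vectors are already available as NumPy arrays, builds the codebook by randomly sampling training frames (one common option, assumed here), and represents an utterance as a normalized histogram of nearest-codeword assignments. The function names `build_codebook` and `boaw_histogram`, the codebook size, and the 13-dimensional random frames are all hypothetical choices made for the example.

```python
import numpy as np


def build_codebook(frames, size, seed=0):
    """Pick `size` training frames at random to serve as codewords (an assumption)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(frames), size=size, replace=False)
    return frames[idx]


def boaw_histogram(frames, codebook):
    """Map each frame to its nearest codeword and return a normalized histogram."""
    # Squared Euclidean distances, shape (n_frames, n_codewords).
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(codebook)).astype(float)
    # Normalizing makes the features independent of utterance length.
    return counts / counts.sum()


# Synthetic stand-ins for frame-level features (hypothetical data).
rng = np.random.default_rng(1)
train_frames = rng.normal(size=(500, 13))
codebook = build_codebook(train_frames, size=16)

utterance = rng.normal(size=(120, 13))
features = boaw_histogram(utterance, codebook)
```

Each utterance, whatever its length, is mapped to a fixed-size vector (here 16 values), which is what lets a standard utterance-level classifier consume it. Because the histogram discards frame order, it is a local description of the utterance in the sense the abstract uses.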


DOI: 10.21437/Interspeech.2018-1076

Cite as: Gosztolya, G., Grósz, T., Tóth, L. (2018) General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical & Self-Assessed Affect and Heart Beats. Proc. Interspeech 2018, 531-535, DOI: 10.21437/Interspeech.2018-1076.


@inproceedings{Gosztolya2018,
  author={Gábor Gosztolya and Tamás Grósz and László Tóth},
  title={General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical \& Self-Assessed Affect and Heart Beats},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={531--535},
  doi={10.21437/Interspeech.2018-1076},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1076}
}