Computational Paralinguistics: Automatic Assessment of Emotions, Mood and Behavioural State from Acoustics of Speech

Zafi Sherhan Syed, Julien Schroeter, Kirill Sidorov, David Marshall


Paralinguistic analysis of speech remains a challenging task due to the many confounding factors that affect speech production. In this paper, we address the Interspeech 2018 Computational Paralinguistics Challenge (ComParE), which aims to push the boundaries of sensitivity to non-textual information conveyed in the acoustics of speech. We attack the problem on several fronts. We posit that a substantial amount of paralinguistic information is contained in spectral features alone. To this end, we use a large ensemble of Extreme Learning Machines to classify spectral features. We further investigate the applicability of (an ensemble of) CNN-GRU networks to model temporal variations therein. We report on the details of the experiments and the results for three ComParE sub-challenges: Atypical Affect, Self-Assessed Affect and Crying. Our results compare favourably with, and in some cases exceed, the published state-of-the-art performance.
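The core classifier mentioned in the abstract, the Extreme Learning Machine, is a single-hidden-layer network whose hidden weights are drawn at random and whose output weights are solved in closed form via the Moore-Penrose pseudoinverse. The sketch below is a minimal illustration of that general technique in NumPy, not the authors' implementation; the function names, hidden-layer size, and activation are assumptions.

```python
import numpy as np

def train_elm(X, y, n_hidden=256, seed=0):
    """Fit an Extreme Learning Machine (illustrative sketch, not the
    paper's code): random hidden projection, then a closed-form
    least-squares solve for the output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden activations
    beta = np.linalg.pinv(H) @ y                     # pseudoinverse solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

An ensemble, as used in the paper, could then be formed by training several such machines with different random seeds and averaging their outputs; the exact aggregation scheme here is again an assumption.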


DOI: 10.21437/Interspeech.2018-2019

Cite as: Syed, Z.S., Schroeter, J., Sidorov, K., Marshall, D. (2018) Computational Paralinguistics: Automatic Assessment of Emotions, Mood and Behavioural State from Acoustics of Speech. Proc. Interspeech 2018, 511-515, DOI: 10.21437/Interspeech.2018-2019.


@inproceedings{Syed2018,
  author={Zafi Sherhan Syed and Julien Schroeter and Kirill Sidorov and David Marshall},
  title={Computational Paralinguistics: Automatic Assessment of Emotions, Mood and Behavioural State from Acoustics of Speech},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={511--515},
  doi={10.21437/Interspeech.2018-2019},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2019}
}