16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Fisher Vectors with Cascaded Normalization for Paralinguistic Analysis

Heysem Kaya (1), Alexey A. Karpov (2), Albert Ali Salah (1)

(1) Boğaziçi Üniversitesi, Turkey
(2) Russian Academy of Sciences, Russia

Computational Paralinguistics has several unresolved issues, one of which is coping with the large variability due to speakers, spoken content, and corpora. In this paper, we address variability compensation by proposing a novel method composed of (i) Fisher vector encoding of low-level descriptors extracted from the signal, (ii) speaker z-normalization applied after speaker clustering, (iii) non-linear normalization of features, and (iv) classification based on Kernel Extreme Learning Machines and Partial Least Squares regression. For experimental validation, we apply the proposed method to the Eating Condition sub-challenge of the INTERSPEECH 2015 Computational Paralinguistics Challenge (ComParE 2015), which is a seven-class classification task. In our preliminary experiments, the proposed method achieves an Unweighted Average Recall (UAR) of 83.1%, outperforming the challenge test set baseline UAR (65.9%) by a large margin.
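As an illustration of the normalization cascade and the evaluation measure named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that utterance-level feature vectors (e.g. Fisher vectors) and speaker cluster labels are already available. The abstract does not specify the non-linear normalization, so signed power normalization followed by L2 normalization is used here as an assumed stand-in. All function names and parameters (`speaker_z_normalize`, `power_l2_normalize`, `alpha`, `eps`) are hypothetical.

```python
import numpy as np


def speaker_z_normalize(X, speaker_ids, eps=1e-8):
    """Z-normalize each feature using the statistics of its speaker cluster."""
    X = np.asarray(X, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    out = np.empty_like(X)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        mu = X[mask].mean(axis=0)
        sigma = X[mask].std(axis=0)
        # eps guards against features that are constant within a cluster
        out[mask] = (X[mask] - mu) / (sigma + eps)
    return out


def power_l2_normalize(X, alpha=0.5, eps=1e-12):
    """Assumed non-linear step: signed power, then unit L2 norm per row."""
    X = np.asarray(X, dtype=float)
    X = np.sign(X) * np.abs(X) ** alpha
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def unweighted_average_recall(y_true, y_pred):
    """UAR: the mean of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Toy data: four utterances from two (hypothetical) speaker clusters.
X = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]])
speaker_ids = np.array([0, 0, 1, 1])

# Cascade: speaker-level z-normalization, then non-linear normalization.
Z = power_l2_normalize(speaker_z_normalize(X, speaker_ids))
```

The normalized vectors `Z` would then be passed to a classifier. UAR is the natural score for a multi-class task with unequal class sizes, since each class contributes equally to the average regardless of how many instances it has.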


Bibliographic reference. Kaya, Heysem / Karpov, Alexey A. / Salah, Albert Ali (2015): "Fisher vectors with cascaded normalization for paralinguistic analysis", in INTERSPEECH-2015, 909-913.