This paper describes an algorithm for detecting non-linguistic vocalisations, such as laughter or fillers, based on acoustic features. The proposed algorithm combines the benefits of Gaussian mixture models (GMMs) with those of support vector machines (SVMs). Three GMMs were trained, one each for garbage, laughter, and fillers, and an SVM model was then trained in the GMM score space. Various experiments were run to tune the parameters of the proposed algorithm, using data sets derived from the SSPNet Vocalisation Corpus (SVC), provided for the Social Signals Sub-Challenge of the INTERSPEECH 2013 Computational Paralinguistics Challenge. The results showed a substantial increase in the unweighted average of the area under the receiver operating characteristic curve (UAAUC) over the baseline (from 87.6% to over 94% on the development set), which confirmed the effectiveness of the proposed method.
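The UAAUC figure quoted above can be illustrated with a minimal sketch. It assumes that UAAUC is the plain mean of the per-class AUCs for the two target classes (laughter and filler), with garbage serving only as a negative class; the abstract does not spell this out. The function names `auc` and `uaauc`, the label strings, and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    This equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one, with ties counted
    as one half.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def uaauc(
    labels: Sequence[str],
    class_scores: Dict[str, Sequence[float]],
    target_classes: Sequence[str] = ("laughter", "filler"),  # assumed targets
) -> float:
    """Unweighted mean of one-vs-rest AUCs over the target classes."""
    per_class = [
        auc(class_scores[c], [label == c for label in labels])
        for c in target_classes
    ]
    return sum(per_class) / len(per_class)


# Toy frame-level example (made-up scores, for illustration only).
labels = ["garbage", "laughter", "filler", "garbage", "laughter", "filler"]
class_scores = {
    "laughter": [0.1, 0.9, 0.2, 0.3, 0.8, 0.4],
    "filler": [0.2, 0.1, 0.7, 0.6, 0.3, 0.5],
}
print(uaauc(labels, class_scores))  # 0.9375
```

In the toy data the laughter scores separate perfectly (AUC 1.0), while one garbage frame outranks one filler frame (AUC 0.875), giving a mean of 0.9375. The pairwise loop is quadratic in the number of frames and is meant for clarity; a sort-based rank computation would be preferable on a full corpus.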
Bibliographic reference. Janicki, Artur (2013): "Non-linguistic vocalisation recognition based on hybrid GMM-SVM approach", In INTERSPEECH-2013, 153-157.