13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

The log-Gabor Method: Speech Classification Using Spectrogram Image Analysis

Harm Buisman, Eric Postma

Tilburg center for Cognition and Communication, Tilburg University, Tilburg, The Netherlands

We explored the suitability of the log-Gabor method, a speech analysis method inspired by Ezzat, Bouvrie & Poggio (2007), for automatic classification of personality and likability traits in speech. The core idea underlying the log-Gabor method is to treat the spectrogram as an image of spectro-temporal information. The image is transformed into Gabor energy values using the two-dimensional logarithmic Gabor transform, which is a standard feature extraction method in visual texture analysis. The aggregated energy values are mapped onto classes by means of a support vector machine (SVM). The log-Gabor method performed above baseline on both the INTERSPEECH Personality and Likability Sub-Challenges: 74.2% on the Likability task (baseline 58.0%) and 78.1% on the Personality task (baseline 70.3%). These results lead us to conclude that the log-Gabor method is a feasible method for extracting perceptual cues from speech.
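The pipeline described above (spectrogram as image, two-dimensional log-Gabor transform, aggregated energy values) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the filter-bank parameters (3 scales, 4 orientations, bandwidth settings), the log-magnitude spectrogram, and the choice of mean energy per filter as the aggregation are all assumptions made for illustration only.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq bins, frames)
    return np.log1p(mag)

def log_gabor_bank(shape, n_scales=3, n_orients=4, min_wavelength=4.0,
                   mult=2.0, sigma_on_f=0.55, sigma_theta=0.6):
    """2-D log-Gabor transfer functions, built in the frequency domain.

    All parameter values are illustrative assumptions.
    """
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                      # avoid log(0) at the DC term
    theta = np.arctan2(fy, fx)
    bank = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)   # centre frequency of scale s
        radial = np.exp(-np.log(radius / f0) ** 2
                        / (2 * np.log(sigma_on_f) ** 2))
        radial[0, 0] = 0.0                  # log-Gabor filters have no DC response
        for o in range(n_orients):
            angle = o * np.pi / n_orients
            # wrapped angular distance to the filter orientation
            d = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d ** 2 / (2 * sigma_theta ** 2))
            bank.append(radial * angular)
    return bank

def log_gabor_features(image):
    """Mean Gabor energy per filter (the aggregation is an assumption)."""
    spectrum = np.fft.fft2(image)
    bank = log_gabor_bank(image.shape)
    return np.array([np.mean(np.abs(np.fft.ifft2(spectrum * g)) ** 2)
                     for g in bank])

# Synthetic stand-in for a speech signal: a noisy chirp, 1 s at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * (200 + 300 * t) * t) + 0.1 * rng.standard_normal(t.size)
features = log_gabor_features(spectrogram(signal))
print(features.shape)  # (12,) = 3 scales x 4 orientations
```

In a classification setting, one such feature vector per utterance could be stacked into a matrix and passed to an SVM, for example scikit-learn's `sklearn.svm.SVC`; the kernel and hyperparameters used in the paper are not stated in this abstract.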

Index Terms: spectro-temporal analysis, spectrogram analysis, log-Gabor filters, likability classification, personality classification, support vector machines

Bibliographic reference. Buisman, Harm / Postma, Eric (2012): "The log-Gabor method: speech classification using spectrogram image analysis", In INTERSPEECH-2012, 518-521.