Odyssey 2008: The Speaker and Language Recognition Workshop

Stellenbosch, South Africa
January 21-24, 2008

Dimension Reduction of the Modulation Spectrogram for Speaker Verification

Tomi Kinnunen (1), Kong-Aik Lee (2), Haizhou Li (2)

(1) Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, Finland
(2) Speech and Dialogue Processing Lab, Human Language Technology Department, Institute for Infocomm Research (I2R), Singapore

The modulation spectrogram is obtained from the conventional speech spectrogram by applying short-term spectral analysis along the temporal trajectory of each frequency bin. In its original definition, the modulation spectrogram is a high-dimensional representation, and it is not clear how to extract features from it. In this paper, we define a low-dimensional feature that captures the shape of the modulation spectra. The recognition accuracy of the modulation-spectrogram-based classifier improves from our previous result of EER = 25.1% to EER = 17.4% on the NIST 2001 speaker recognition task.
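As a rough illustration of the definition above, the following Python sketch computes a magnitude spectrogram and then applies a second short-term Fourier analysis along time for each frequency bin. It is not the authors' implementation: the function names, the Hamming windows, and all frame, hop, and context lengths are assumptions chosen only to make the example concrete, and the low-dimensional feature proposed in the paper is not reproduced here.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (n_frames, n_bins).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def modulation_spectrogram(spec, context=32, shift=16):
    """Short-term spectral analysis along the time trajectory of each bin.

    Returns an array of shape (n_blocks, n_bins, context // 2 + 1):
    one modulation spectrum per frequency bin per block of `context` frames.
    context and shift are illustrative values, not taken from the paper.
    """
    window = np.hamming(context)[:, None]
    n_blocks = 1 + (spec.shape[0] - context) // shift
    blocks = np.stack(
        [spec[b * shift : b * shift + context] * window for b in range(n_blocks)]
    )
    # blocks has shape (n_blocks, context, n_bins); transform over the time axis.
    mod = np.abs(np.fft.rfft(blocks, axis=1))
    # Reorder to (n_blocks, n_bins, n_modulation_bins).
    return np.transpose(mod, (0, 2, 1))
```

With these assumed settings, each block yields 129 acoustic-frequency bins times 17 modulation-frequency bins, that is 2193 values, which gives a sense of why the raw representation is considered high-dimensional and why a compact description of the modulation spectra is useful.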


Bibliographic reference.  Kinnunen, Tomi / Lee, Kong-Aik / Li, Haizhou (2008): "Dimension reduction of the modulation spectrogram for speaker verification", In Odyssey-2008, paper 030.