A so-called modulation spectrogram is obtained from the conventional speech spectrogram by short-term spectral analysis along the temporal trajectories of the frequency bins. In its original definition, the modulation spectrogram is a high-dimensional representation, and it is not clear how to extract features from it. In this paper, we define a low-dimensional feature that captures the shape of the modulation spectra. The equal error rate (EER) of the modulation-spectrogram-based classifier improves from our previous result of 25.1% to 17.4% on the NIST 2001 speaker recognition task.
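The construction described in the first sentence can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the frame and segment lengths, the Hann windows, and the per-bin mean removal are all assumptions chosen for illustration, and the low-dimensional feature proposed in the paper is not reproduced here.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D array into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def magnitude_spectrogram(x, frame_len=256, hop=128):
    """Conventional magnitude spectrogram, shape (n_frames, frame_len // 2 + 1)."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, axis=1))


def modulation_spectrogram(spec, mod_len=32, mod_hop=16):
    """Short-term spectral analysis along the time trajectory of each frequency bin.

    Returns an array of shape (n_segments, n_bins, mod_len // 2 + 1):
    one (acoustic frequency x modulation frequency) matrix per segment.
    """
    n_frames, n_bins = spec.shape
    n_segments = 1 + (n_frames - mod_len) // mod_hop
    window = np.hanning(mod_len)[:, None]
    out = np.empty((n_segments, n_bins, mod_len // 2 + 1))
    for s in range(n_segments):
        seg = spec[s * mod_hop : s * mod_hop + mod_len]  # (mod_len, n_bins)
        # Assumption: remove each bin's mean so the DC term does not dominate.
        seg = seg - seg.mean(axis=0, keepdims=True)
        out[s] = np.abs(np.fft.rfft(seg * window, axis=0)).T
    return out


# Toy input (hypothetical): a 1000 Hz tone, amplitude-modulated at 7.8125 Hz.
fs = 8000
t = np.arange(fs) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 7.8125 * t)) * np.sin(2 * np.pi * 1000.0 * t)

spec = magnitude_spectrogram(x)          # 61 frames x 129 acoustic bins
mod_spec = modulation_spectrogram(spec)  # shape (2, 129, 17)
```

With these illustrative settings, each analysis segment already holds 129 x 17 = 2193 values, which shows why the raw representation is considered high-dimensional and why some form of dimension reduction is needed before classification.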
Cite as: Kinnunen, T., Lee, K.-A., Li, H. (2008) Dimension reduction of the modulation spectrogram for speaker verification. Proc. The Speaker and Language Recognition Workshop (Odyssey 2008), paper 30
@inproceedings{kinnunen08_odyssey,
  author={Tomi Kinnunen and Kong-Aik Lee and Haizhou Li},
  title={{Dimension reduction of the modulation spectrogram for speaker verification}},
  year=2008,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2008)},
  pages={paper 30}
}