Odyssey 2008: The Speaker and Language Recognition Workshop

Stellenbosch, South Africa
January 21-24, 2008

Age and Gender Classification using Modulation Cepstrum

Jitendra Ajmera (1), Felix Burkhardt (2)

(1) Deutsche Telekom Laboratories; (2) T-Systems Enterprise Services GmbH, Berlin, Germany

This paper proposes using modulation cepstrum coefficients instead of cepstral coefficients for extracting metadata such as age and gender. These coefficients are extracted by applying a discrete cosine transform to a time sequence of cepstral coefficients. The lower-order coefficients of this transform represent smooth cepstral trajectories over time. The results presented in this paper show that cepstral trajectories corresponding to lower (3-14 Hz) modulation frequencies provide the best discrimination. The proposed system achieves 50.2% overall accuracy on the 7-class task, while the accuracy of human labelers on a subset of the evaluation material used in this work is 54.7%.
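
The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unnormalized DCT-II, and the idea of keeping the first few coefficients per trajectory are assumptions made for illustration, and the paper's actual window length, frame rate, and coefficient selection are not given here. The helper `modulation_frequency_hz` uses the standard property that the k-th DCT-II basis function completes k/2 cycles over the analysis window.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (2 * t + 1) / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def modulation_cepstrum(frames, num_mod_coeffs):
    """Apply a DCT along time to each cepstral dimension.

    frames: list of T cepstral vectors, each of length D (hypothetical input).
    Returns D lists, each holding the first num_mod_coeffs DCT coefficients
    of that dimension's trajectory over the T frames.
    """
    dims = len(frames[0])
    features = []
    for d in range(dims):
        # Trajectory of the d-th cepstral coefficient over time.
        trajectory = [frame[d] for frame in frames]
        # Low-order coefficients describe the smooth part of the trajectory.
        features.append(dct_ii(trajectory)[:num_mod_coeffs])
    return features


def modulation_frequency_hz(k, num_frames, frame_rate_hz):
    """Approximate modulation frequency of DCT-II index k.

    Basis function k completes k/2 cycles over num_frames frames.
    """
    return k * frame_rate_hz / (2.0 * num_frames)
```

For example, with an assumed frame rate of 100 frames per second and a 50-frame window, `modulation_frequency_hz(3, 50, 100.0)` gives 3.0 Hz, so selecting a band of modulation frequencies amounts to selecting a range of DCT indices.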


