14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Modeling Spectral Variability for the Classification of Depressed Speech

Nicholas Cummins (1), Julien Epps (1), Vidhyasaharan Sethu (1), Michael Breakspear (1), Roland Goecke (2)

(1) University of New South Wales, Australia
(2) University of Canberra, Australia

Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker's average weighted variance (a GMM-based indicator of speaker variability) and their level of depression support this hypothesis. Further evidence is provided by a comparison of classification accuracies from seven different GMM-UBM systems, each formed by adapting a different combination of parameters during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.
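
As a rough illustration of the two quantities named in the abstract, the sketch below shows variance-only adaptation of a diagonal-covariance UBM and a weighted-variance summary of a GMM. It is a minimal sketch under stated assumptions, not the authors' implementation: the update uses the commonly used relevance-factor form of MAP adaptation with the UBM weights and means left fixed, and the definition of average weighted variance used here (mixture weight times per-component mean variance, summed over components) is only one plausible reading, since the abstract does not give the formula. The function names, the relevance factor of 16, and the variance floor are all hypothetical choices made for the example.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Responsibilities Pr(i | x_t) under a diagonal-covariance GMM."""
    diff = X[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_p = np.log(weights) + log_gauss                      # (T, M)
    log_p -= log_p.max(axis=1, keepdims=True)                # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def map_adapt_variances(X, weights, means, variances,
                        relevance=16.0, floor=1e-3):
    """Variance-only relevance MAP; weights and means stay at UBM values.

    relevance and floor are assumed values, not taken from the paper.
    """
    post = posteriors(X, weights, means, variances)          # (T, M)
    n = post.sum(axis=0)                                     # soft counts, (M,)
    # Posterior-weighted second moment E_i[x^2] per component and dimension
    second = (post.T @ (X ** 2)) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                   # adaptation coefficient
    adapted = (alpha * second
               + (1.0 - alpha) * (variances + means ** 2)
               - means ** 2)
    # With fixed means the update can go non-positive, so apply a floor
    return np.maximum(adapted, floor)

def average_weighted_variance(weights, variances):
    """Assumed definition: sum_i w_i * mean_d(sigma_{i,d}^2)."""
    return float(np.sum(weights * variances.mean(axis=1)))
```

In this sketch, a speaker model would be obtained by calling `map_adapt_variances` on that speaker's feature frames, and `average_weighted_variance` would then reduce the adapted model to a single number per speaker that could be correlated with a depression score.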


Bibliographic reference. Cummins, Nicholas / Epps, Julien / Sethu, Vidhyasaharan / Breakspear, Michael / Goecke, Roland (2013): "Modeling spectral variability for the classification of depressed speech", in INTERSPEECH-2013, 857-861.