Speech Features for Depression Detection

Saurabh Sahu, Carol Espy-Wilson


In this paper we discuss speech features that are useful for detecting depression. Neuro-physiological changes associated with depression affect motor coordination and can disrupt articulatory precision in speech. We use the Mundt database and focus on the six speakers in it who transitioned between being depressed and not depressed according to their Hamilton depression scores. We quantify the degree of breathiness, jitter and shimmer using a parameter based on the average magnitude difference function (AMDF). Measures from sustained vowels spoken in isolation show that all of these attributes can increase when a person is depressed. In this study, we focus on using features from free-flowing speech to classify the depressed state of an individual. To do so, we look at the vowel regions that most resemble sustained vowels. We train an SVM for each speaker and perform speaker-dependent classification of the test speech frames. For most speakers, the AMDF-based feature gives better accuracy (62–87% frame-wise accuracy for 5 out of 6 speakers) than 13-dimensional MFCCs with their velocity and acceleration coefficients. Using the AMDF-based feature, we also train a speaker-independent SVM, which gives an average accuracy of 77.8% for utterance-based classification.
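The abstract does not say how the AMDF-based parameter is derived, so the following is only a minimal sketch of the standard AMDF itself, not the authors' feature. For a frame x[0..N-1] and lag k, it computes the mean of |x[n] - x[n+k]| over the overlapping samples; for voiced speech the curve dips near multiples of the pitch period. The function names `amdf` and `estimate_period`, the lag range, and the synthetic test tone are all assumptions made for illustration.

```python
import math


def amdf(frame, max_lag):
    """Return AMDF values for lags 0..max_lag (standard definition)."""
    n = len(frame)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(frame)")
    values = []
    for lag in range(max_lag + 1):
        # Mean absolute difference between the frame and its lagged copy,
        # taken over the n - lag samples where both are defined.
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag))
        values.append(total / (n - lag))
    return values


def estimate_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the deepest AMDF dip."""
    values = amdf(frame, max_lag)
    return min(range(min_lag, max_lag + 1), key=values.__getitem__)


if __name__ == "__main__":
    # Hypothetical test signal: a 100 Hz tone sampled at 8 kHz.
    fs, f0 = 8000, 100
    tone = [math.sin(2 * math.pi * f0 * i / fs) for i in range(400)]
    print(estimate_period(tone, min_lag=40, max_lag=120))
```

On a clean periodic signal the dip is close to zero at the pitch period. How the depth and regularity of such dips are turned into measures of breathiness, jitter and shimmer is described in the paper itself, not in this sketch.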


DOI: 10.21437/Interspeech.2016-1566

Cite as

Sahu, S., Espy-Wilson, C. (2016) Speech Features for Depression Detection. Proc. Interspeech 2016, 1928-1932.

Bibtex
@inproceedings{Sahu+2016,
author={Saurabh Sahu and Carol Espy-Wilson},
title={Speech Features for Depression Detection},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1566},
url={http://dx.doi.org/10.21437/Interspeech.2016-1566},
pages={1928--1932}
}