Investigating the Effect of Audio Duration on Dementia Detection Using Acoustic Features

Jochen Weiner, Miguel Angrick, Srinivasan Umesh, Tanja Schultz


This paper presents recent progress toward our goal of enabling area-wide pre-screening methods for the early detection of dementia based on automatically processing the conversational speech of a representative group of more than 200 subjects. We focus on conversational speech because it is a natural form of communication that can be recorded unobtrusively, without adding stress to subjects and without the need for controlled clinical settings. We describe our unsupervised processing chain, consisting of voice activity detection and speaker diarization followed by feature extraction and detection of early signs of dementia. The unsupervised system achieves up to 0.645 unweighted average recall (UAR) and compares favorably to a system that was carefully designed on manually annotated data. To further lower the burden on subjects, we investigate UAR as a function of speech duration and find that about 12 minutes of interview are sufficient to achieve the best UAR.
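The UAR reported above is the mean of the per-class recalls, so the dementia class counts as much as the (typically larger) control class regardless of class imbalance. A minimal sketch of the computation; the labels below are invented purely for illustration:

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """UAR: average the recall of each class, weighting all classes equally."""
    correct = defaultdict(int)  # per-class count of correct predictions
    total = defaultdict(int)    # per-class count of reference labels
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

# Hypothetical labels: 1 = dementia, 0 = control
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
# Mean of 0.5 (dementia recall) and 5/6 (control recall)
print(unweighted_average_recall(y_true, y_pred))
```

For a binary task this is equivalent to scikit-learn's `balanced_accuracy_score`, or to `recall_score` with macro averaging.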


DOI: 10.21437/Interspeech.2018-57

Cite as: Weiner, J., Angrick, M., Umesh, S., Schultz, T. (2018) Investigating the Effect of Audio Duration on Dementia Detection Using Acoustic Features. Proc. Interspeech 2018, 2324-2328, DOI: 10.21437/Interspeech.2018-57.


@inproceedings{Weiner2018,
  author={Jochen Weiner and Miguel Angrick and Srinivasan Umesh and Tanja Schultz},
  title={Investigating the Effect of Audio Duration on Dementia Detection Using Acoustic Features},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={2324--2328},
  doi={10.21437/Interspeech.2018-57},
  url={http://dx.doi.org/10.21437/Interspeech.2018-57}
}