Alzheimer's Dementia Detection from Audio and Language Modalities in Spontaneous Speech

Edward L. Campbell, Laura Docio-Fernandez, Javier Jiménez-Raboso, Carmen Gacia-Mateo

Automatic detection of Alzheimer's dementia (AD) by speech processing is enhanced when features of both the acoustic waveform and the content are extracted. Audio and text transcriptions have been widely used in health-related tasks, as spectral and prosodic speech features, as well as semantic and linguistic content, convey information about various diseases. Hence, this paper describes and compares the performance of different AD detection approaches based on both the patient’s voice and the transcription of the message. To this end, five individual systems are analysed: three are speech-based and two are text-based. Specifically, the speech-based systems use the x-vector and i-vector paradigms to characterise speech, and a set of rhythm-based hand-crafted features. For transcription analysis, two systems are proposed: one uses pre-trained BERT models and the other uses knowledge-based linguistic and language modelling features. In addition, to examine whether acoustic and content features are complementary, intra-modality and inter-modality score fusion strategies are studied. Experiments in the framework of the Interspeech 2020 ADReSS challenge show that the BERT-based system outperforms the other individual systems on the AD detection task. Furthermore, the fusion of acoustic- and transcription-based systems provides the best result, suggesting that the two modalities are complementary to some extent.
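
As a rough illustration of score-level fusion, the sketch below combines per-system AD scores for one speaker with a weighted average and then applies a decision threshold. The abstract does not specify the fusion rule, so the weighted average, the function names `fuse_scores` and `classify`, the example scores, and the 0.5 threshold are all assumptions made for illustration only, not the authors' method.

```python
from typing import Optional, Sequence


def fuse_scores(scores: Sequence[float],
                weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of per-system scores for a single speaker.

    Assumption: every system outputs a score in [0, 1], where higher
    means "more likely AD". Equal weights are used when none are given.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


def classify(fused_score: float, threshold: float = 0.5) -> str:
    """Map a fused score to a label using a hypothetical threshold."""
    return "AD" if fused_score >= threshold else "non-AD"


# Hypothetical scores from one acoustic and one text-based system.
acoustic_score, text_score = 0.3, 0.9
fused = fuse_scores([text_score, acoustic_score], weights=[3.0, 1.0])
label = classify(fused)
```

Under this sketch, intra-modality fusion would pass scores from systems of the same modality (for example, the three speech-based systems), while inter-modality fusion would mix speech-based and text-based scores. In practice the weights would be tuned on held-out data rather than fixed by hand.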

doi: 10.21437/IberSPEECH.2021-57

Campbell, E.L, Docio-Fernandez, L, Jiménez-Raboso, J, Gacia-Mateo, C (2021) Alzheimer's Dementia Detection from Audio and Language Modalities in Spontaneous Speech. Proc. IberSPEECH 2021, 270-274, doi: 10.21437/IberSPEECH.2021-57.