Diagnosing Dysarthria with Long Short-Term Memory Networks

Alex Mayle, Zhiwei Mou, Razvan Bunescu, Sadegh Mirshekarian, Li Xu, Chang Liu

This paper proposes the use of Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units to determine whether Mandarin-speaking individuals have a form of dysarthria, based on samples of syllable pronunciations. Several LSTM network architectures are evaluated on this binary classification task, using accuracy and Receiver Operating Characteristic (ROC) curves as metrics. The LSTM models are shown to significantly improve upon a baseline fully connected network, reaching an area under the ROC curve of over 90% on the task of classifying new speakers when a sufficient number of cepstrum coefficients is used. The results show that the LSTM’s ability to leverage temporal information within its input makes it an effective step toward accessible dysarthria diagnoses.
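The abstract reports results as area under the ROC curve. As a minimal sketch of what that metric measures (not the authors' evaluation code), the function below computes it as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The function name `roc_auc`, the label encoding (1 for dysarthric, 0 for control), and the scores are all assumptions chosen for illustration; none of them come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (hypothetical encoding: 1 = dysarthric, 0 = control)
    scores: iterable of classifier outputs, higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up scores for six hypothetical speakers, purely illustrative.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_auc = roc_auc(example_labels, example_scores)  # 8 of 9 pairs are ordered correctly
```

The quadratic pairwise loop is chosen for readability; for large evaluation sets a rank-based computation gives the same value more efficiently.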

DOI: 10.21437/Interspeech.2019-2903

Cite as: Mayle, A., Mou, Z., Bunescu, R., Mirshekarian, S., Xu, L., Liu, C. (2019) Diagnosing Dysarthria with Long Short-Term Memory Networks. Proc. Interspeech 2019, 4514-4518, DOI: 10.21437/Interspeech.2019-2903.

  author={Alex Mayle and Zhiwei Mou and Razvan Bunescu and Sadegh Mirshekarian and Li Xu and Chang Liu},
  title={{Diagnosing Dysarthria with Long Short-Term Memory Networks}},
  booktitle={Proc. Interspeech 2019},