Cross-Teager Cepstral Coefficients For Dysarthric Severity Level Classification

Anand Therattil, Aastha Kachhi, Hemant Patil

Dysarthria is a degenerative motor speech impairment, generally resulting from neurological damage in the human body. This impairment makes speech unintelligible to human listeners to a degree that depends on the patient's severity level. Classification of dysarthric severity level serves as a diagnostic tool to assess the progression of the patient's condition, and it also aids dysarthric Automatic Speech Recognition (ASR), as traditional ASR systems perform poorly on dysarthric speech. This study investigates Cross-Teager Energy Cepstral Coefficients (CTECC), which capture energy-based information from the signals of a microphone array, on the standard and statistically meaningful UA-Speech corpus. With a deep learning architecture, namely a Convolutional Neural Network (CNN), CTECC achieves a classification accuracy of 95.76%. The key objective of this study is to select the optimal microphone (channel), one with the minimum amount of energy that still captures the maximum linguistic information of dysarthric speech. Additionally, the performance of the CTECC feature set is compared with Short-Time Fourier Transform (STFT)-based features, which gave a classification accuracy of 91.76% with the CNN classifier. Further, the Jaccard index, Matthews Correlation Coefficient (MCC), F1-score, and Hamming loss are used to examine the discriminative power of the features. Finally, we analyze the latency of the proposed CTECC feature set for practical deployment of the classification system.
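For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how they can be computed for a single-label, multi-class severity classifier. It is not the authors' evaluation code. The function names (`confusion_matrix`, `severity_metrics`), the macro averaging of F1-score and Jaccard index, the multi-class generalization of MCC, and the use of four integer-coded severity classes in the toy labels are all assumptions made for illustration.

```python
import math


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def severity_metrics(y_true, y_pred, n_classes):
    """Macro F1, macro Jaccard, multi-class MCC, and Hamming loss.

    Hypothetical helper: labels are assumed to be integers in
    [0, n_classes), one label per utterance.
    """
    cm = confusion_matrix(y_true, y_pred, n_classes)
    total = len(y_true)
    correct = sum(cm[k][k] for k in range(n_classes))
    true_counts = [sum(cm[k]) for k in range(n_classes)]
    pred_counts = [sum(cm[r][k] for r in range(n_classes))
                   for k in range(n_classes)]

    f1_scores, jaccards = [], []
    for k in range(n_classes):
        tp = cm[k][k]
        fp = pred_counts[k] - tp
        fn = true_counts[k] - tp
        # Per-class F1 = 2TP / (2TP + FP + FN); Jaccard = TP / (TP + FP + FN).
        f1_scores.append(2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0)
        jaccards.append(tp / (tp + fp + fn) if (tp + fp + fn) else 0.0)

    # Multi-class MCC computed from the confusion-matrix marginals.
    cov = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    denom = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    mcc = cov / denom if denom else 0.0

    return {
        "f1_macro": sum(f1_scores) / n_classes,
        "jaccard_macro": sum(jaccards) / n_classes,
        "mcc": mcc,
        # With one label per utterance, Hamming loss is the error rate.
        "hamming_loss": (total - correct) / total,
    }


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 0, 1, 2, 2, 2, 3, 0]
    print(severity_metrics(y_true, y_pred, n_classes=4))
```

In this single-label setting the Hamming loss equals one minus the accuracy, so it adds no information beyond accuracy. MCC and the macro-averaged scores, by contrast, are sensitive to how the errors are distributed across classes.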

