Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation

Chitralekha Bhat, Bhavik Vachhani, Sunil Kopparapu


Dysarthria is a motor speech disorder resulting from impairment of the muscles responsible for speech production, and is often characterized by slurred or slow speech with low intelligibility. With speech-based applications such as voice biometrics and personal assistants gaining popularity, automatic recognition of dysarthric speech becomes imperative as a step towards including people with dysarthria in the mainstream. In this paper we examine the applicability of voice parameters traditionally used for pathological voice classification, such as jitter, shimmer, F0 and Noise Harmonic Ratio (NHR) contour, in addition to Mel Frequency Cepstral Coefficients (MFCC), for dysarthric speech recognition. Additionally, we show that using multi-taper spectral estimation to compute MFCC improves recognition of unseen dysarthric speech. A deep neural network (DNN) - hidden Markov model (HMM) recognition system fared better than a Gaussian mixture model (GMM) - HMM based system for dysarthric speech recognition. We propose a method to optimally use incremental dysarthric data to improve dysarthric speech recognition for a DNN-HMM based ASR system. All evaluations were done on the Universal Access Speech Corpus.
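As a rough illustration of multi-taper spectral estimation, the sketch below replaces the usual single-window periodogram with an average of periodograms computed under several orthogonal tapers; in an MFCC pipeline such an estimate would be passed to the mel filterbank in place of the single-window power spectrum. The abstract does not say which taper family or weighting the authors used, so the sine tapers, the uniform weights, the naive DFT, and all function names here are assumptions chosen only to keep the example short and dependency-free.

```python
import cmath
import math


def sine_tapers(frame_len, num_tapers):
    """Return num_tapers orthonormal sine tapers of length frame_len.

    Assumption: sine tapers are used for simplicity; the paper's taper
    family is not stated in the abstract.
    """
    norm = math.sqrt(2.0 / (frame_len + 1))
    return [
        [norm * math.sin(math.pi * k * (n + 1) / (frame_len + 1))
         for n in range(frame_len)]
        for k in range(1, num_tapers + 1)
    ]


def power_spectrum(samples):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (demo-sized frames only)."""
    n_samples = len(samples)
    spectrum = []
    for b in range(n_samples // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * b * n / n_samples)
                  for n, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def multitaper_spectrum(frame, num_tapers=4):
    """Average the power spectra of the frame under each taper.

    Assumption: uniform weights across tapers.
    """
    tapers = sine_tapers(len(frame), num_tapers)
    spectra = [power_spectrum([w * x for w, x in zip(taper, frame)])
               for taper in tapers]
    return [sum(column) / num_tapers for column in zip(*spectra)]


# Usage: a 64-sample frame containing a sinusoid centred on DFT bin 8.
frame = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
estimate = multitaper_spectrum(frame, num_tapers=4)
peak_bin = max(range(len(estimate)), key=estimate.__getitem__)
```

Averaging over several tapers lowers the variance of the spectral estimate at the cost of some frequency resolution, so the peak in `estimate` is broader than a single-window periodogram would give.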


DOI: 10.21437/Interspeech.2016-1085

Cite as

Bhat, C., Vachhani, B., Kopparapu, S. (2016) Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation. Proc. Interspeech 2016, 228-232.

Bibtex
@inproceedings{Bhat+2016,
author={Chitralekha Bhat and Bhavik Vachhani and Sunil Kopparapu},
title={Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1085},
url={http://dx.doi.org/10.21437/Interspeech.2016-1085},
pages={228--232}
}