EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

        

Automatic Speech Recognition with Sparse Training Data for Dysarthric Speakers

Phil Green, James Carmichael, Athanassios Hatzis, Pam Enderby, Mark Hawley, Mark Parker

University of Sheffield, U.K.

We describe an unusual ASR application: recognition of command words from severely dysarthric speakers, who have poor control of their articulators. The goal is to allow these clients to control assistive technology by voice. Although this is a small-vocabulary, speaker-dependent, isolated-word application, the speech material is more variable than normal speech, and only a small amount of data is available for training. After training a CDHMM recogniser, it is necessary to predict its likely performance without using an independent test set, so that confusable words can be replaced by alternatives. We present a battery of measures of consistency and confusability, based on forced alignment, which can be used to predict recogniser performance. We show how these measures perform, and how they are presented to the clinicians who are the users of the system.
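
As an illustration of the kind of measure the abstract refers to, the sketch below computes one consistency score and one confusability score per word from forced-alignment log-likelihoods. It is not the battery of measures defined in the paper, which the abstract does not specify. The input layout (`scores[word][i][model]`), the function names, the definitions of "spread" and "margin", and the zero threshold are all assumptions made for this example.

```python
from statistics import mean

def word_measures(scores):
    """Summarise forced-alignment scores per word.

    scores[word][i][model] is assumed to hold the log-likelihood obtained by
    force-aligning training token i of `word` against the HMM for `model`.
    This layout is hypothetical, chosen only for the illustration.
    """
    report = {}
    for word, tokens in scores.items():
        own = [tok[word] for tok in tokens]
        margins, rivals = [], []
        for tok in tokens:
            # Best-scoring competing model for this token.
            rival, rival_score = max(
                ((m, s) for m, s in tok.items() if m != word),
                key=lambda pair: pair[1],
            )
            margins.append(tok[word] - rival_score)
            rivals.append(rival)
        worst = min(range(len(tokens)), key=lambda i: margins[i])
        report[word] = {
            "mean_own": mean(own),            # average fit to the word's own model
            "spread": max(own) - min(own),    # assumed consistency measure (smaller is better)
            "min_margin": margins[worst],     # assumed confusability measure (larger is better)
            "closest_rival": rivals[worst],   # model that comes closest to winning
        }
    return report

def flag_confusable(report, threshold=0.0):
    """Return words whose worst-case margin falls below an assumed threshold."""
    return [w for w, r in report.items() if r["min_margin"] < threshold]

# Made-up scores for a three-word vocabulary, two training tokens per word.
scores = {
    "lamp": [{"lamp": -50.0, "lamb": -52.0, "radio": -80.0},
             {"lamp": -54.0, "lamb": -53.0, "radio": -85.0}],
    "lamb": [{"lamp": -60.0, "lamb": -51.0, "radio": -82.0},
             {"lamp": -58.0, "lamb": -52.0, "radio": -84.0}],
    "radio": [{"lamp": -90.0, "lamb": -88.0, "radio": -40.0},
              {"lamp": -95.0, "lamb": -91.0, "radio": -42.0}],
}
report = word_measures(scores)
flagged = flag_confusable(report)
```

With these made-up scores, one "lamp" token aligns better to the "lamb" model than to its own, so "lamp" is flagged as a candidate for replacement by an alternative command word. No independent test set is needed, since only the training tokens are scored.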


Bibliographic reference. Green, Phil / Carmichael, James / Hatzis, Athanassios / Enderby, Pam / Hawley, Mark / Parker, Mark (2003): "Automatic speech recognition with sparse training data for dysarthric speakers", in EUROSPEECH-2003, 1189-1192.