9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speech Recognition for Vocalized and Subvocal Modes of Production Using Surface EMG Signals from the Neck and Face

Geoffrey S. Meltzner (1), Jason Sroka (1), James T. Heaton (2), L. Donald Gilmore (3), Glen Colby (1), Serge Roy (3), Nancy Chen (1), Carlo J. De Luca (3)

(1) BAE Systems, USA; (2) Massachusetts General Hospital, USA; (3) Altec Inc., USA

We report automatic speech recognition accuracy for individual words using eleven surface electromyographic (sEMG) recording locations on the face and neck during three speaking modes: vocalized, mouthed, and mentally rehearsed. An HMM-based recognition system was trained and tested on a 65-word vocabulary produced by 9 American English speakers in all three speaking modes. Our results indicate high sEMG-based recognition accuracy for the vocalized and mouthed speaking modes (mean rates of 92.1% and 86.7%, respectively), but recognition could not be performed on mentally rehearsed speech because it produced insufficient sEMG activity.

Bibliographic reference. Meltzner, Geoffrey S. / Sroka, Jason / Heaton, James T. / Gilmore, L. Donald / Colby, Glen / Roy, Serge / Chen, Nancy / De Luca, Carlo J. (2008): "Speech recognition for vocalized and subvocal modes of production using surface EMG signals from the neck and face", In INTERSPEECH-2008, 2667-2670.