Surface electromyography (EMG) can be used to record the activation potentials of articulatory muscles while a person speaks. This technique could enable silent speech interfaces, as EMG signals are generated even when people pantomime speech without producing sound. Having effective silent speech interfaces would enable a number of compelling applications, allowing people to communicate in areas where they would not want to be overheard or where the background noise is so prevalent that they could not be heard. In order to use EMG signals in speech interfaces, however, there must be a relatively accurate method to map the signals to speech.
Up to this point, it appears that most attempts to use EMG signals for speech interfaces have focused on Automatic Speech Recognition (ASR) based on features derived from EMG signals. Following the lead of other researchers who worked with Electro-Magnetic Articulograph (EMA) data and Non-Audible Murmur (NAM) speech, we explore the alternative idea of using Voice Transformation (VT) techniques to synthesize speech from EMG signals. With speech output, both ASR systems and human listeners can directly use EMG-based systems. We report the results of our preliminary studies, noting the difficulties we encountered and suggesting areas for future work.
Bibliographic reference. Toth, Arthur R. / Wand, Michael / Schultz, Tanja (2009): "Synthesizing speech from electromyography using voice transformation techniques", In INTERSPEECH-2009, 652-655.