COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction

University of East Anglia, Norwich, UK
August 30-31, 2004

Robust Methods for EMG Signal Processing for Audio-EMG-based Multi-Modal Speech Recognition

Zhipeng Zhang, Hiroyuki Manabe, Tsutomu Horikoshi, Tomoyuki Ohya

Multimedia Laboratories, NTT DoCoMo, Yokosuka, Kanagawa, Japan

This paper proposes robust methods for processing EMG (electromyography) signals in the framework of audio-EMG-based speech recognition. The EMG signals are captured while the speaker is talking and are used as auxiliary information for recognizing speech. Two robust processing methods, Cepstral Mean Normalization and Spectral Subtraction, are investigated for EMG signals to improve recognition performance. We also investigate the importance of stream weighting in audio-EMG-based multi-modal speech recognition. Experiments are carried out under various noise conditions, and the results show the effectiveness of the proposed methods. Combining the methods yields a significant improvement in word accuracy over the audio-only recognition scheme.
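For readers unfamiliar with the three techniques named in the abstract, the following is a minimal sketch of their textbook forms, not the authors' implementation. All function names, the `floor` parameter, and the convention that the two stream weights sum to 1 are assumptions made for illustration; the abstract does not state the feature extraction, flooring rule, or weight values actually used.

```python
def cepstral_mean_normalization(frames):
    """Subtract the per-coefficient mean, taken over all frames of one
    utterance, from every frame (hypothetical helper)."""
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


def spectral_subtraction(magnitudes, noise_estimate, floor=0.01):
    """Subtract a fixed noise magnitude estimate from each frame.

    Each bin is floored at `floor` times its original magnitude so that
    the result never becomes negative (the floor value is an assumption).
    """
    return [
        [max(m - n, floor * m) for m, n in zip(frame, noise_estimate)]
        for frame in magnitudes
    ]


def combine_streams(audio_loglik, emg_loglik, audio_weight):
    """Weighted sum of per-stream log-likelihoods.

    Assumes the EMG weight is 1 - audio_weight, a common convention in
    multi-stream recognition that the abstract does not confirm.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_loglik + (1.0 - audio_weight) * emg_loglik


if __name__ == "__main__":
    # Toy values only; these are not data from the paper.
    print(cepstral_mean_normalization([[1.0, 2.0], [3.0, 6.0]]))
    print(spectral_subtraction([[1.0, 0.5]], [0.25, 1.0], floor=0.1))
    print(combine_streams(-10.0, -20.0, audio_weight=0.75))
```

In this sketch, lowering `audio_weight` shifts the combined score toward the EMG stream, which is the kind of adjustment one would expect to matter as acoustic noise increases.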



Bibliographic reference.  Zhang, Zhipeng / Manabe, Hiroyuki / Horikoshi, Tsutomu / Ohya, Tomoyuki (2004): "Robust methods for EMG signal processing for audio-EMG-based multi-modal speech recognition", In Robust2004, paper 21.