11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Multi Resolution Discriminative Models for Subvocalic Speech Recognition

Mark Raugas, Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan

Raytheon BBN Technologies, USA

In this work, we investigate the use of discriminative models for automatic speech recognition of subvocalic speech via surface electromyography (sEMG). We also investigate the suitability of multiresolution analysis, in the form of the discrete wavelet transform (DWT), for sEMG-based speech recognition. We examine appropriate dimensionality reduction techniques for features extracted using different wavelet families and compare our results with the conventional mel-frequency cepstral coefficients (MFCC) used in speech recognition. Our results indicate that a simple model fusion between cepstral- and wavelet-domain features can achieve superior recognition performance. Fusing the MFCC and wavelet-based SVM models, using principal component analysis for feature reduction, yields the best performance, with a mean accuracy of 95.13% over a set of nine speakers on a 65-word closed-vocabulary task.
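
As an illustration of the multiresolution analysis mentioned above, the following is a minimal sketch of a multi-level DWT applied to a single signal frame. It is not the authors' implementation. The abstract does not say which wavelet families, decomposition depth, or per-band features were used, so the Haar wavelet, the three-level depth, the synthetic 64-sample frame, and the log-energy summary of each subband are all assumptions chosen for simplicity. The function names are hypothetical.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT; len(signal) must be even."""
    s = 1.0 / math.sqrt(2.0)
    # Scaled pairwise sums give the coarse approximation,
    # scaled pairwise differences give the detail coefficients.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition: [approx_L, detail_L, ..., detail_1]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        # Re-decompose the approximation to reach the next, coarser resolution.
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def subband_log_energies(signal, levels, eps=1e-12):
    """Assumed feature vector: log energy of each subband (not from the paper)."""
    bands = haar_wavedec(signal, levels)
    return [math.log(sum(c * c for c in band) + eps) for band in bands]


# Synthetic stand-in for one sEMG frame (a real frame would come from a sensor).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = subband_log_energies(frame, levels=3)
```

Each decomposition level halves the time resolution of the approximation band, so a three-level decomposition of a 64-sample frame yields four subbands of lengths 8, 8, 16, and 32. In a pipeline like the one described, such per-frame features would then be reduced in dimension and passed to a classifier; those stages are omitted here.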


Bibliographic reference. Raugas, Mark / Sridhar, Vivek Kumar Rangarajan / Prasad, Rohit / Natarajan, Prem (2010): "Multi resolution discriminative models for subvocalic speech recognition", in INTERSPEECH-2010, 2626-2629.