10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Target Speech GMM-Based Spectral Compensation for Noise Robust Speech Recognition

Takahiro Shinozaki, Sadaoki Furui

Tokyo Institute of Technology, Japan

To improve speech recognition performance in adverse conditions, a noise compensation method is proposed that applies a transformation in the spectral domain, with parameters optimized based on the likelihood of a speech GMM modeled in the feature domain. The idea is that additive and convolutional noises have mathematically simple expressions in the spectral domain, while speech characteristics are better modeled in a feature domain such as MFCC. The proposed method works as a feature extraction front-end that is independent of the decoding engine, and it can compensate for non-stationary additive and convolutional noises with a short time delay. It includes spectral subtraction as a special case in which no parameter optimization is performed. Experiments were performed using the AURORA-2J database. The results show that the proposed method achieves significantly higher recognition performance than spectral subtraction.
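The idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the two-parameter transform (`gain` for a convolutional term, `noise_scale` for an additive term), the plain DCT of the log spectrum standing in for MFCC, the diagonal-covariance GMM, and the grid search over parameters. With `gain=1` and `noise_scale=1` the transform reduces to ordinary spectral subtraction with flooring.

```python
import numpy as np

def compensate(power_spec, noise_est, gain=1.0, noise_scale=1.0, floor=1e-3):
    """Assumed spectral-domain transform: subtract a scaled noise estimate, then apply a gain."""
    cleaned = gain * (power_spec - noise_scale * noise_est)
    # Flooring keeps the spectrum positive so that the log taken later is defined.
    return np.maximum(cleaned, floor * power_spec)

def log_cepstrum(power_spec, n_ceps=4):
    """Stand-in for MFCC (assumption): DCT-II of the log spectrum, without a mel filterbank."""
    n_bins = power_spec.shape[-1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bins)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bins))
    return np.log(power_spec) @ dct.T

def gmm_loglik(feats, weights, means, variances):
    """Total log-likelihood of all frames under a diagonal-covariance GMM."""
    diff = feats[:, None, :] - means[None, :, :]
    log_comp = (np.log(weights)[None, :]
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
                - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2))
    # Log-sum-exp over mixture components, computed stably.
    m = log_comp.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.sum(np.exp(log_comp - m), axis=1))))

def best_params(power_spec, noise_est, gmm, gains, noise_scales):
    """Grid search (assumption): keep the parameters whose features the speech GMM likes best."""
    best = None
    for g in gains:
        for a in noise_scales:
            feats = log_cepstrum(compensate(power_spec, noise_est, g, a))
            ll = gmm_loglik(feats, *gmm)
            if best is None or ll > best[0]:
                best = (ll, g, a)
    return best  # (log-likelihood, gain, noise_scale)
```

Here `power_spec` is a frames-by-bins array of noisy power spectra, `noise_est` is a per-bin noise estimate, and `gmm` is a `(weights, means, variances)` tuple trained on clean-speech features. The point of the sketch is the split between domains: the transform acts on spectra, while the objective is evaluated on cepstral features.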


Bibliographic reference. Shinozaki, Takahiro / Furui, Sadaoki (2009): "Target speech GMM-based spectral compensation for noise robust speech recognition", in INTERSPEECH-2009, 1255-1258.