12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Generalized-Log Spectral Mean Normalization for Speech Recognition

Hilman F. Pardede, Koichi Shinoda

Tokyo Institute of Technology, Japan

Most compensation methods for noise-robust speech recognition assume that speech, additive noise, and convolutive noise are independent. However, the nonlinear nature of the distortion caused by noise may introduce correlation between noise and speech. To tackle this issue, we propose generalized-log spectral mean normalization (GLSMN), in which log spectral mean normalization (LSMN) is carried out in the q-logarithmic domain. Experiments on the Aurora-2 database show that GLSMN improved speech recognition accuracy by 20% compared to cepstral mean normalization (CMN) in the mel-frequency domain.
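To make the idea of mean normalization in the q-logarithmic domain concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard q-logarithm from Tsallis statistics, ln_q(x) = (x^(1-q) - 1) / (1 - q), which reduces to the natural logarithm as q approaches 1. The function names `q_log` and `q_log_mean_normalize`, the frame layout, and the choice to subtract a per-bin average over time are illustrative assumptions; the abstract does not specify these details.

```python
import math


def q_log(x, q):
    """Standard q-logarithm; equals the natural log when q == 1."""
    if x <= 0:
        raise ValueError("q_log requires x > 0")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def q_log_mean_normalize(frames, q):
    """Subtract the per-bin time average in the q-log domain.

    frames: list of frames, each a list of positive spectral values
            (hypothetical layout: frames[t][b] is bin b at time t).
    This is an assumed, simplified reading of "mean normalization in the
    q-logarithmic domain", not the formulation from the paper.
    """
    n_frames = len(frames)
    n_bins = len(frames[0])
    # Map every spectral value into the q-log domain.
    transformed = [[q_log(v, q) for v in frame] for frame in frames]
    # Average each frequency bin over all frames.
    means = [sum(f[b] for f in transformed) / n_frames for b in range(n_bins)]
    # Remove the per-bin mean from every frame.
    return [[f[b] - means[b] for b in range(n_bins)] for f in transformed]
```

With q = 1 the sketch reduces to ordinary log-spectral mean normalization, in which a fixed per-bin gain (a simple model of convolutive distortion) cancels exactly. Other values of q change how the transform compresses the spectrum.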

Bibliographic reference. Pardede, Hilman F. / Shinoda, Koichi (2011): "Generalized-log spectral mean normalization for speech recognition", in INTERSPEECH-2011, 1645-1648.