Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Noise Robust Speech Recognition Using a Dynamic-Cepstrum

Kiyoaki Aikawa (1), Tsuyoshi Saito (2)

(1) ATR Human Information Processing Research Laboratories, Kyoto, Japan
(2) Toyohashi Univ. of Technology, Aichi, Japan

Noise-robust speech recognition is achieved using a dynamic-cepstrum. The dynamic-cepstrum is a new spectral representation that incorporates time-frequency forward masking. The time-frequency masking suppresses spectral components that are common to the current spectrum and the preceding spectra. This property suggests that the dynamic-cepstrum is applicable to noisy speech recognition. Speaker-dependent and speaker-independent phoneme recognition experiments are conducted using hidden Markov models. The experimental results demonstrate that the dynamic-cepstrum is more robust than the conventional cepstrum against stationary noise and amplitude-modulated noise. The dynamic-cepstrum is also found to be superior to the conventional cepstrum combined with a delta-cepstrum.
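
The masking idea described above can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give; it only assumes that forward masking can be approximated by subtracting a decaying weighted sum of the preceding log spectra from the current one. The function name `forward_mask` and the parameters `alpha`, `beta`, and `depth` are hypothetical and chosen purely for illustration.

```python
def forward_mask(frames, alpha=0.3, beta=0.7, depth=3):
    """Subtract a decaying weighted sum of preceding frames from each frame.

    frames: list of log-spectrum frames (lists of floats of equal length).
    alpha, beta, depth: hypothetical masking gain, decay rate, and
    look-back length; none of these values come from the paper.
    """
    masked = []
    for i, current in enumerate(frames):
        out = list(current)
        for n in range(1, depth + 1):
            if i - n < 0:
                break  # no earlier frame available
            weight = alpha * beta ** (n - 1)  # older frames mask less
            previous = frames[i - n]
            for k in range(len(out)):
                out[k] -= weight * previous[k]
        masked.append(out)
    return masked


# Toy input: bin 0 carries a constant (stationary) component,
# bin 1 switches on only in the last frame.
frames = [[1.0, 0.0]] * 3 + [[1.0, 1.0]]
masked = forward_mask(frames)
```

In this toy case the stationary component in bin 0 is attenuated in the last frame because it also appears in the preceding frames, while the newly appearing component in bin 1 passes through unchanged. That is the qualitative behavior the abstract attributes to time-frequency masking under stationary noise.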

