8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A Harmonic-Model-Based Front End for Robust Speech Recognition

Michael L. Seltzer (1), Jasha Droppo (2), Alex Acero (2)

(1) Carnegie Mellon University, USA
(2) Microsoft Research, USA

Speech recognition accuracy degrades significantly when the speech has been corrupted by noise, especially when the system has been trained on clean speech. Many compensation algorithms have been developed that require reliable online noise estimates or a priori knowledge of the noise. When such estimates or knowledge are difficult to obtain, these methods fail. We present a new robustness algorithm that avoids these problems by making no assumptions about the corrupting noise. Instead, we exploit properties inherent to the speech signal itself to denoise the recognition features. In this method, speech is decomposed into harmonic and noise-like components, which are then processed independently and recombined. By processing noise-corrupted speech in this manner, we achieve significant improvements in recognition accuracy on the Aurora 2 task.
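The abstract does not give the details of the decomposition, so the following is only an illustrative sketch of the general idea of splitting a voiced frame into a harmonic part and a noise-like residual, then recombining them with different weights. It is not the authors' algorithm: it works on the waveform rather than on recognition features, it assumes the pitch period is a known integer number of samples, and the function names (`decompose`, `recombine`) and the weights `w_h` and `w_n` are hypothetical.

```python
import math
import random


def decompose(frame, period):
    """Split a frame into a periodic (harmonic) part and a residual.

    Assumption for illustration: the pitch period is a known integer
    number of samples and the frame is stationary over its length.
    """
    if period < 1 or period > len(frame):
        raise ValueError("period must be between 1 and len(frame)")
    # Average all samples that share the same phase within the pitch period.
    sums = [0.0] * period
    counts = [0] * period
    for i, x in enumerate(frame):
        sums[i % period] += x
        counts[i % period] += 1
    template = [s / c for s, c in zip(sums, counts)]
    # The harmonic part repeats the averaged period across the frame.
    harmonic = [template[i % period] for i in range(len(frame))]
    # Whatever the periodic model does not explain is treated as noise-like.
    residual = [x - h for x, h in zip(frame, harmonic)]
    return harmonic, residual


def recombine(harmonic, residual, w_h=1.0, w_n=0.1):
    """Weighted sum of the two parts; the weights are hypothetical."""
    return [w_h * h + w_n * r for h, r in zip(harmonic, residual)]


if __name__ == "__main__":
    period, length = 80, 400  # e.g. 100 Hz pitch at an 8 kHz sampling rate
    clean = [
        sum(math.sin(2 * math.pi * k * i / period) / k for k in (1, 2, 3))
        for i in range(length)
    ]
    rng = random.Random(0)
    noisy = [c + rng.gauss(0.0, 0.3) for c in clean]

    harmonic, residual = decompose(noisy, period)
    output = recombine(harmonic, residual)

    err_in = sum((x - c) ** 2 for x, c in zip(noisy, clean))
    err_out = sum((y - c) ** 2 for y, c in zip(output, clean))
    print("squared error before:", err_in)
    print("squared error after: ", err_out)
```

Averaging across pitch periods is used here only because it needs no noise estimate, which mirrors the motivation stated in the abstract. A practical front end would also need pitch tracking and a way to handle unvoiced frames.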


Bibliographic reference. Seltzer, Michael L. / Droppo, Jasha / Acero, Alex (2003): "A harmonic-model-based front end for robust speech recognition", in EUROSPEECH-2003, 1277-1280.