EUROSPEECH 2001 Scandinavia
This paper presents a novel noise-robust front-end algorithm and evaluates its performance on the Aurora 2 database. Most noise-robust algorithms for speech recognition assume stationary noise, i.e. that a noise estimate taken prior to the utterance will remain accurate for the duration of that utterance. For non-stationary noises, however, the noise spectrum can change during the utterance, so the estimated and actual noise spectra for a given frame can differ substantially, resulting in poor performance. The algorithm presented here provides a continuous estimate of the noise, using the structure of the voiced speech spectrum to sample the gaps (or "tunnels") between the harmonic spectral peaks. Compared to the ETSI standard MFCC front-end, the proposed algorithm delivers an average performance improvement of 43.93% on the Aurora 2 database.
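The abstract does not describe the algorithm in detail, so the following is only a rough illustration of the general idea of sampling the spectrum between harmonic peaks. It is not the authors' method. It assumes that a pitch estimate for the voiced frame is already available, that the "tunnels" lie midway between adjacent harmonics, and that linear interpolation across frequency is an acceptable way to fill in the remaining bins. The function name `tunnel_noise_estimate` and all of its parameters are hypothetical.

```python
import numpy as np


def tunnel_noise_estimate(power_spec, f0_hz, sample_rate, n_fft):
    """Illustrative per-bin noise estimate for one voiced frame.

    Assumption (not from the paper): noise is sampled at the midpoints
    between adjacent harmonics of f0 and linearly interpolated elsewhere.
    power_spec is the one-sided power spectrum (length n_fft // 2 + 1).
    """
    power_spec = np.asarray(power_spec, dtype=float)
    freqs = np.arange(len(power_spec)) * sample_rate / n_fft
    nyquist = sample_rate / 2.0

    # Midpoints between harmonics: 1.5*f0, 2.5*f0, ... up to Nyquist.
    n_tunnels = int(nyquist / f0_hz - 0.5)
    tunnel_freqs = (np.arange(1, n_tunnels + 1) + 0.5) * f0_hz

    # Map each tunnel frequency to the nearest FFT bin.
    tunnel_bins = np.round(tunnel_freqs * n_fft / sample_rate).astype(int)
    tunnel_bins = np.unique(tunnel_bins[tunnel_bins < len(power_spec)])
    if tunnel_bins.size == 0:
        raise ValueError("f0 is too high: no gaps between harmonics below Nyquist")

    # Interpolate the tunnel samples across all bins; values outside the
    # first and last tunnel are held constant by np.interp.
    return np.interp(freqs, freqs[tunnel_bins], power_spec[tunnel_bins])
```

Because such an estimate is recomputed for each voiced frame, it can follow a noise spectrum that changes during the utterance, which is the property the abstract emphasises. A practical system would also need a pitch tracker, a voicing decision, and some handling of unvoiced frames, none of which are sketched here.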
Bibliographic reference. Ealey, Douglas / Kelleher, Holly / Pearce, David (2001): "Harmonic tunnelling: tracking non-stationary noises during speech", In EUROSPEECH-2001, 437-440.