ITRW on Non-Linear Speech Processing (NOLISP 05)

Barcelona, Spain
April 19-22, 2005

Noise Robust Automatic Speech Recognition with Adaptive Quantile Based Noise Estimation and Speech Band Emphasizing Filter Bank

Casper Stork Bonde, Carina Graversen, Andreas Gregers Gregersen, Kim Hoang Ngo, Kim Nørmark, Mikkel Purup, Thomas Thorsen, Børge Lindberg

Department of Communication Technology, Aalborg University, Aalborg Ø, Denmark

An important topic in Automatic Speech Recognition (ASR) is reducing the effect of noise, in particular when there is a mismatch between the training and application conditions.

Many noise robustness schemes in the feature processing domain require a noise estimate obtained before the speech signal begins, which in turn requires noise-robust voice activity detection and an assumption of stationary noise. However, these requirements are often not met, so it is of particular interest to investigate methods such as Quantile Based Noise Estimation (QBNE), which estimates the noise during both speech and non-speech sections without a voice activity detector. While the standard QBNE method uses a fixed, pre-defined quantile across all frequency bands, this paper proposes adaptive QBNE (AQBNE), which adapts the quantile individually to each frequency band.
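
As a rough illustration of the idea described above, the following minimal sketch takes, for each frequency band, a quantile of the band's magnitudes over time as the noise estimate. It is not the authors' implementation: the function name `quantile_noise_estimate`, the floor-based quantile index, and the toy values are assumptions made for illustration. In particular, the abstract does not say how AQBNE selects the per-band quantiles, so the sketch simply accepts them as an input.

```python
from typing import List, Sequence, Union


def quantile_noise_estimate(
    spectrogram: Sequence[Sequence[float]],
    q: Union[float, Sequence[float]] = 0.5,
) -> List[float]:
    """Return one noise magnitude per band: the q-th quantile over time.

    spectrogram[t][k] is the magnitude of frequency band k in frame t.
    q is either a single quantile shared by all bands (fixed QBNE) or one
    quantile per band (the per-band idea behind AQBNE; the rule used to
    choose those values is NOT given in the abstract and is left to the caller).
    """
    if not spectrogram:
        raise ValueError("spectrogram must contain at least one frame")
    n_frames = len(spectrogram)
    n_bands = len(spectrogram[0])

    # Expand a scalar quantile to one value per band.
    quantiles = [q] * n_bands if isinstance(q, (int, float)) else list(q)
    if len(quantiles) != n_bands:
        raise ValueError("need exactly one quantile per frequency band")

    noise = []
    for k, q_k in enumerate(quantiles):
        if not 0.0 <= q_k <= 1.0:
            raise ValueError("quantiles must lie in [0, 1]")
        # Sort this band's magnitudes over all frames, speech or not.
        band = sorted(frame[k] for frame in spectrogram)
        # Assumed convention: floor of q * (T - 1), no interpolation.
        noise.append(band[int(q_k * (n_frames - 1))])
    return noise


# Toy input: 5 frames, 2 bands (hypothetical values).
frames = [
    [1.0, 0.5],
    [1.2, 4.0],
    [9.0, 0.6],
    [8.0, 0.4],
    [1.1, 5.0],
]
fixed = quantile_noise_estimate(frames, 0.5)             # same quantile everywhere
per_band = quantile_noise_estimate(frames, [0.5, 0.25])  # one quantile per band
```

Because the estimate is taken over all frames, no voice activity detector is needed. Passing a list instead of a scalar for `q` is the only change required to move from a fixed quantile to per-band quantiles.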

Furthermore, the paper investigates an alternative to the standard mel frequency cepstral coefficient (MFCC) filter bank: an empirically chosen Speech Band Emphasizing (SBE) filter bank, which improves the resolution in the speech band.

The combinations of AQBNE and SBE are tested on the Danish SpeechDat-Car database and compared with the performance achieved by the standards presented by the Aurora consortium (Aurora Baseline and Aurora Advanced Frontend). For the High Mismatch (HM) condition, AQBNE achieves significantly better performance than the Aurora Baseline, both when combined with SBE and when combined with the standard MFCC filter bank. AQBNE also outperforms the Aurora Baseline for the Medium Mismatch (MM) and Well Matched (WM) conditions. Although the Aurora Advanced Frontend achieves superior performance in all three conditions, AQBNE is still a relevant method to consider for small-footprint applications.


Bibliographic reference.  Bonde, Casper Stork / Graversen, Carina / Gregersen, Andreas Gregers / Ngo, Kim Hoang / Nørmark, Kim / Purup, Mikkel / Thorsen, Thomas / Lindberg, Børge (2005): "Noise robust automatic speech recognition with adaptive quantile based noise estimation and speech band emphasizing filter bank", In NOLISP-2005, 275-286.