8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A New Perspective on Feature Extraction for Robust In-Vehicle Speech Recognition

Umit H. Yapanel, John H.L. Hansen

University of Colorado at Boulder, USA

Reliable speech recognition for in-vehicle applications has recently emerged as a challenging research domain. This study focuses on the feature extraction stage of this problem. The approach is based on Minimum Variance Distortionless Response (MVDR) spectrum estimation. MVDR is used to robustly estimate the spectral envelope of the speech signal and is shown to be very accurate and relatively insensitive to additive noise. The proposed feature estimation process drops the traditional Mel-scaled filterbank used for perceptually motivated frequency partitioning and instead directly warps the FFT power spectrum of speech. On an extended digit recognition task in real car environments, the word error rate (WER) is shown to decrease by 27.3% with respect to MFCCs and by 18.8% with respect to the recently proposed PMCCs. The proposed feature estimation approach, called PMVDR, is conclusively shown to be a better speech representation in real environments, with emphasis on time-varying car noise.
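As a rough illustration of the MVDR spectrum estimation mentioned in the abstract, the sketch below evaluates the textbook MVDR power spectrum P(w) = 1 / (v(w)^H R^{-1} v(w)) for a single frame, where R is the Toeplitz autocorrelation matrix of the frame. It is not the authors' implementation: the function name, the model order, the frequency grid size, and the small diagonal regularisation are all assumptions made for illustration, and the perceptual warping and cepstral steps of PMVDR are not included.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=129):
    """Textbook MVDR power spectrum of one frame on n_freqs points in [0, pi].

    Hypothetical helper for illustration; 'order' and 'n_freqs' are assumed
    defaults, not values taken from the paper.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Biased autocorrelation estimates r[0..order] (keeps R positive semidefinite).
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)]) / n
    # Toeplitz autocorrelation matrix R[i, j] = r[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order + 1), np.arange(order + 1)))
    # Small diagonal loading (an assumption) so that R stays invertible.
    R = r[lags] + 1e-9 * r[0] * np.eye(order + 1)
    R_inv = np.linalg.inv(R)
    omegas = np.linspace(0.0, np.pi, n_freqs)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{j*order*w}], one per row.
    V = np.exp(1j * np.outer(omegas, np.arange(order + 1)))
    # Quadratic form v^H R^{-1} v for every frequency; real and positive.
    denom = np.sum((V.conj() @ R_inv) * V, axis=1).real
    return omegas, 1.0 / denom

if __name__ == "__main__":
    # Synthetic test frame: one sinusoid in weak white noise.
    rng = np.random.default_rng(0)
    t = np.arange(400)
    frame = np.sin(0.3 * np.pi * t) + 0.01 * rng.standard_normal(400)
    omegas, power = mvdr_spectrum(frame)
    print("peak frequency (rad/sample):", omegas[np.argmax(power)])
```

The explicit matrix inverse is used here only for clarity. The sketch also does not handle an all-zero frame, for which the autocorrelation matrix is singular.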


Bibliographic reference. Yapanel, Umit H. / Hansen, John H.L. (2003): "A new perspective on feature extraction for robust in-vehicle speech recognition", In EUROSPEECH-2003, 1281-1284.