9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

In-Car Speech Recognition Using Model-Based Wiener Filter and Multi-Condition Training

Masanori Tsujikawa, Takayuki Arakawa, Ryosuke Isotani

NEC Corporation, Japan

This paper presents in-car speech recognition using a model-based Wiener filter (MBW) and multi-condition (MC) training. The MBW is a two-step denoising algorithm that combines a rough estimate of the speech signal with a precise one. Correcting the roughly estimated signal with a Gaussian mixture model (GMM) enables accurate denoising at low computational cost. In an evaluation of in-car speech recognition, both the GMM and the back-end hidden Markov model (HMM) were trained on studio-recorded speech signals together with the same signals mixed with noise recorded in real car environments. In-car speech signals for testing were recorded with multiple microphones in different car environments. In terms of word accuracy obtained with the MC-trained HMM, the MBW with the MC-trained GMM outperformed the noise reduction of the ETSI advanced front-end.
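
The two-step idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the estimator details, so the sketch assumes spectral subtraction for the rough estimate, a diagonal-covariance GMM over clean log-power spectra for the correction, and a standard Wiener gain. All function names, the `floor` parameter, and the toy values are hypothetical.

```python
import numpy as np

def rough_estimate(noisy_power, noise_power, floor=0.01):
    # Step 1 (assumed): spectral subtraction with a spectral floor.
    return np.maximum(noisy_power - noise_power, floor * noisy_power)

def gmm_posteriors(log_spec, weights, means, variances):
    # Posterior probability of each diagonal-covariance component
    # given one log-power spectrum of shape (D,).
    diff = log_spec[None, :] - means                      # (K, D)
    log_lik = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2.0 * np.pi * variances), axis=1)
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)                          # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def model_based_wiener_gain(noisy_power, noise_power, weights, means, variances):
    # Step 2 (assumed): correct the rough estimate with the GMM by taking
    # the posterior-weighted mean of the component means, then form the
    # Wiener gain S / (S + N) from the corrected speech power.
    rough = rough_estimate(noisy_power, noise_power)
    post = gmm_posteriors(np.log(rough), weights, means, variances)
    refined_power = np.exp(post @ means)
    return refined_power / (refined_power + noise_power)

# Toy usage with a hypothetical one-component GMM.
noisy = np.array([4.0, 4.0])
noise = np.array([1.0, 1.0])
weights = np.array([1.0])
means = np.log(np.array([[3.0, 3.0]]))
variances = np.ones((1, 2))
gain = model_based_wiener_gain(noisy, noise, weights, means, variances)
denoised_power = gain * noisy
```

In this sketch the GMM is evaluated once per frame on a low-dimensional spectrum, which is one plausible reading of why the correction step adds little computational cost. With multi-condition training, the same GMM parameters would be estimated from both clean and noise-mixed speech.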

Bibliographic reference. Tsujikawa, Masanori / Arakawa, Takayuki / Isotani, Ryosuke (2008): "In-car speech recognition using model-based Wiener filter and multi-condition training", in INTERSPEECH-2008, 972-975.