8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Energy Contour Extraction for In-Car Speech Recognition

Tai-Hwei Hwang

Industrial Technology Research Institute, Taiwan

The time derivatives of speech energy, such as the delta and delta-delta log energy, are known to be critical features for automatic speech recognition (ASR). However, at low signal-to-noise ratios (SNR) their discriminative ability can be limited, or they can even become harmful, because the energy contour is corrupted. By taking advantage of the spectral characteristics of in-car noise, the speech energy contour is extracted from the high-pass filtered signal so as to reduce the distortion in the delta energy. Such filtering can be implemented with a pre-emphasis-like filter or as a summation of higher-frequency band energies. A Chinese name recognition task is conducted to evaluate the proposed method, using both real in-car speech and artificially generated in-car speech as test data. The experimental results show that the method improves recognition accuracy for in-car speech at low SNR as well as for clean speech.
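
The pre-emphasis-like variant described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives no filter coefficient, frame size, or delta window, so the values below (a first-order filter with coefficient 0.97, 400-sample frames with a 160-sample hop, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate, and a regression delta over +/-2 frames) are common front-end defaults chosen only for illustration. The function names are hypothetical. The sketch also assumes, as the abstract implies, that in-car noise is concentrated at low frequencies.

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter y[n] = x[n] - coeff * x[n-1].

    coeff=0.97 is an assumed, commonly used value, not one from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])


def log_energy_contour(signal, frame_len=400, hop=160, floor=1e-10):
    """Log energy of each frame (frame_len and hop are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len]
        energies[i] = np.sum(frame ** 2)
    # Floor the energy so that silent frames do not produce log(0).
    return np.log(np.maximum(energies, floor))


def delta(contour, width=2):
    """Regression-based time derivative over +/- width frames.

    Edges are handled by repeating the first and last values.
    """
    contour = np.asarray(contour, dtype=float)
    n = len(contour)
    padded = np.pad(contour, width, mode="edge")
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = np.zeros(n)
    for k in range(1, width + 1):
        out += k * (padded[width + k:width + k + n]
                    - padded[width - k:width - k + n])
    return out / denom


def delta_log_energy(signal, high_pass=True):
    """Delta log energy, optionally computed on the high-pass filtered signal."""
    if high_pass:
        signal = pre_emphasis(signal)
    return delta(log_energy_contour(signal))
```

Calling `delta_log_energy(x, high_pass=False)` gives the conventional delta log energy for comparison, and applying `delta` to its own output gives a delta-delta contour. The alternative mentioned in the abstract, summing the energies of the higher frequency bands, would replace `pre_emphasis` and `log_energy_contour` with a sum over selected filterbank outputs; it is not shown here because the abstract does not state which bands are used.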

