5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

On the Interaction Between Time and Frequency Filtering of Speech Parameters for Robust Speech Recognition

Dusan Macho (1), Climent Nadeu (2)

(1) Slovak Technical University and Slovak Academy of Sciences, Slovak Republic
(2) Universitat Politecnica de Catalunya, Spain

One of today's great challenges in speech recognition is ensuring that the speech representation is robust. The recognition rate usually drops sharply when the speech is corrupted, e.g. by convolutional or additive noise, and the speech features are not designed to be robust. In this paper we study the effect of additive noise on the logarithmic filter-bank energy representation. We use time and frequency filtering techniques to emphasize the discriminative information and to reduce the mismatch between the noisy and clean speech representations. A 2-D spectral representation is introduced to show which regions of the 2-D quefrency-modulation frequency domain are most affected by noise and to help design the frequency and time filter shapes. Experiments with one and two dynamic feature sets show the usefulness of combining time and frequency filtering for speech recognition in both white and low-pass noise. Finally, the power time and frequency filtering technique is presented.
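As a rough illustration of the ideas named in the abstract, the following minimal sketch filters a matrix of logarithmic filter-bank energies along the band axis (frequency filtering) and along the frame axis (time filtering), then takes the magnitude of a 2-D DFT as one possible 2-D quefrency-modulation frequency view. The abstract does not specify the filter shapes, so the taps `[1, 0, -1]`, the function names, the matrix size, and the random input data are all assumptions made for illustration only, not the authors' configuration.

```python
import numpy as np

def filter_along_axis(log_fbe, taps, axis):
    """Apply an FIR filter along one axis of a (frames x bands) matrix.

    axis=1 filters across bands (frequency filtering);
    axis=0 filters across frames (time filtering).
    Samples outside the matrix are treated as zero.
    """
    return np.apply_along_axis(
        lambda v: np.convolve(v, taps, mode="same"), axis, log_fbe
    )

def two_d_spectrum(log_fbe):
    """Magnitude of the 2-D DFT of a (frames x bands) matrix.

    The transform along frames indexes modulation frequency;
    the transform along bands indexes quefrency.
    """
    return np.abs(np.fft.fft2(log_fbe))

# Hypothetical input: 100 frames x 20 bands of synthetic "log energies".
rng = np.random.default_rng(0)
log_fbe = rng.normal(size=(100, 20))

# Assumed illustrative taps: y[n] = x[n+1] - x[n-1] on interior samples.
freq_taps = np.array([1.0, 0.0, -1.0])
time_taps = np.array([1.0, 0.0, -1.0])

freq_filtered = filter_along_axis(log_fbe, freq_taps, axis=1)
time_freq_filtered = filter_along_axis(freq_filtered, time_taps, axis=0)

spectrum_before = two_d_spectrum(log_fbe)
spectrum_after = two_d_spectrum(time_freq_filtered)
```

Comparing `spectrum_before` and `spectrum_after` for clean and noisy inputs is one way to see which regions of the 2-D domain a given pair of filters attenuates or emphasizes. Because the two filters act on different axes, they can be applied in either order.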

