5th International Conference on Spoken Language Processing
One of today's great challenges in speech recognition is to ensure the robustness of the speech representation used. The recognition rate is usually strongly reduced when the speech is corrupted, e.g. by convolutional or additive noise, and the speech features are not designed to be robust. In this paper we study the effect of additive noise on the logarithmic filter-bank energy representation. We use time and frequency filtering techniques to emphasize the discriminative information and to reduce the mismatch between the noisy and clean speech representations. A 2-D spectral representation is introduced to show which regions of the 2-D quefrency-modulation frequency domain are most affected by noise and to help design the frequency and time filter shapes. Experiments with one and two dynamic feature sets show the usefulness of combining time and frequency filtering for recognition of speech corrupted by both white and low-pass noise. Finally, the power time and frequency filtering technique is presented.
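As a rough illustration of the operations named in the abstract, the following minimal sketch filters a matrix of logarithmic filter-bank energies along the frequency (band) axis and along the time (frame) axis, and computes a 2-D magnitude spectrum whose axes correspond to modulation frequency and quefrency. The function names, the (frames, bands) array layout, and the filter taps (1, 0, -1) are assumptions made for this example; the abstract does not specify the filter shapes used in the paper.

```python
import numpy as np

def filter_along_axis(log_fbe, taps, axis):
    """Apply an FIR filter along one axis of a (frames, bands) array.

    Edges are zero-padded by np.convolve with mode="same".
    """
    taps = np.asarray(taps, dtype=float)
    return np.apply_along_axis(
        lambda v: np.convolve(v, taps, mode="same"), axis, log_fbe
    )

def frequency_filter(log_fbe, taps=(1.0, 0.0, -1.0)):
    """Filter each frame across the band index (assumed taps)."""
    return filter_along_axis(log_fbe, taps, axis=1)

def time_filter(log_fbe, taps=(1.0, 0.0, -1.0)):
    """Filter each band trajectory across the frame index (assumed taps)."""
    return filter_along_axis(log_fbe, taps, axis=0)

def spectrum_2d(log_fbe):
    """Magnitude of the 2-D DFT of the mean-removed log energies.

    Axis 0 indexes modulation frequency (DFT over frames);
    axis 1 indexes quefrency (DFT over bands).
    """
    centered = log_fbe - log_fbe.mean()
    return np.abs(np.fft.fft2(centered))

if __name__ == "__main__":
    # Synthetic stand-in for log filter-bank energies: 100 frames, 20 bands.
    rng = np.random.default_rng(0)
    log_fbe = rng.normal(size=(100, 20))
    features = time_filter(frequency_filter(log_fbe))
    print(features.shape, spectrum_2d(log_fbe).shape)
```

Because both filters are linear and act on different axes, the order in which they are applied does not change the result in this sketch. Comparing `spectrum_2d` for clean and noisy versions of the same utterance is one way to see which regions of the 2-D domain a given noise affects most.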
Bibliographic reference. Macho, Dusan / Nadeu, Climent (1998): "On the interaction between time and frequency filtering of speech parameters for robust speech recognition", In ICSLP-1998, paper 1137.