5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Multiresolution Channel Normalization for ASR in Reverberant Environments

Carlos Avendano, Sangita Tibrewala, Hynek Hermansky

Department of Electrical Engineering, Oregon Graduate Institute of Science & Technology, Portland, Oregon, USA

To overcome the problems associated with the long impulse responses produced by reverberation, we use a long time window (high frequency resolution) analysis during the channel normalization steps of feature extraction for automatic speech recognition (ASR). After normalization, frequency resolution is traded for time resolution to increase the rate at which the time information is sampled (short-time domain), yielding an appropriate domain from which to derive ASR features. Experiments on data with reverberation times of about 0.5 s show that the new technique significantly improves speech recognizer performance under reverberation, with only some performance degradation on clean speech.
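The abstract does not spell out the normalization rule or how the resolution trade is carried out, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes (a) log-magnitude mean subtraction per frequency bin as the channel normalization, (b) overlap-add resynthesis followed by short-window re-analysis as the way to move to the short-time domain, and (c) an 8 kHz sampling rate, so that the 4096-sample window spans about 0.5 s. The function names, window lengths, and hop sizes are all hypothetical and are not taken from the paper.

```python
import numpy as np

def stft(x, win_len, hop):
    """Hann-windowed STFT; returns an array of shape (frames, win_len // 2 + 1)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, win_len, hop):
    """Weighted overlap-add inverse of stft, using the same Hann window."""
    window = np.hanning(win_len)
    frames = np.fft.irfft(spec, n=win_len, axis=1) * window
    n = win_len + hop * (len(frames) - 1)
    out = np.zeros(n)
    norm = np.zeros(n)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + win_len] += frame
        norm[i * hop:i * hop + win_len] += window ** 2
    # Guard against division by zero where the window weights vanish.
    return out / np.maximum(norm, 1e-8)

def multires_normalize(x, long_win=4096, long_hop=1024,
                       short_win=256, short_hop=80):
    """Hypothetical sketch: normalize at high frequency resolution,
    then re-analyze at high time resolution."""
    # Step 1: long-window analysis (high frequency resolution).
    spec = stft(x, long_win, long_hop)
    log_mag = np.log(np.abs(spec) + 1e-10)
    phase = np.angle(spec)

    # Step 2 (assumed rule): subtract the time average of the log
    # magnitude in each frequency bin to remove a fixed channel.
    log_mag -= log_mag.mean(axis=0, keepdims=True)
    normalized = np.exp(log_mag) * np.exp(1j * phase)

    # Step 3 (assumed mechanism): resynthesize the signal, then
    # re-analyze with a short window (high time resolution).
    y = istft(normalized, long_win, long_hop)
    return np.abs(stft(y, short_win, short_hop))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(16000)  # 2 s of noise at the assumed 8 kHz
    features = multires_normalize(signal)
    print(features.shape)  # (short-time frames, frequency bins)
```

In this sketch the long window is chosen to be comparable to the reverberation time, so the channel acts approximately as a multiplicative factor within each analysis frame, while the short re-analysis window restores the frame rate that ASR features typically need. A real front end would go on to compute features from the short-time spectrum, which is outside what the abstract describes.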

Bibliographic reference. Avendano, Carlos / Tibrewala, Sangita / Hermansky, Hynek (1997): "Multiresolution channel normalization for ASR in reverberant environments", in EUROSPEECH-1997, 1107-1110.