5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Coherence-based Subband Decomposition for Robust Speech and Speaker Recognition in Noisy and Reverberant Rooms

Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia

DIAC, Universidad Politecnica de Madrid, Spain

This paper studies the acoustic characteristics of sound fields in enclosed rooms in the joint presence of speech and noise, in order to design a broadband microphone array system capable of coping with both coherent and diffuse noise. Several state-of-the-art speech enhancement array structures are presented and compared with our new system in terms of correct word recognition rates in a simple command-and-control task. The proposed structure, based on a broadband subband-nested array, estimates the spatial coherence in real time to determine whether each subband is coherent or diffuse, and applies a different filter in each case. It also improves the classical Wiener post-filter, typically used for diffuse noise suppression, so that coherent noise is properly cancelled. Results obtained on a database of 15-channel simultaneous recordings under different reverberation and noise conditions show better performance than previously proposed structures.
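
The coherence-based decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame length, the subband edges, and the 0.5 threshold are all assumptions chosen for illustration, and a naive DFT from the standard library stands in for whatever filter bank the actual system uses. The sketch estimates the magnitude-squared coherence between two microphone signals by averaging over frames, then labels each subband by its mean coherence.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real frame; returns the non-negative-frequency bins."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def msc(x, y, frame_len=32):
    """Magnitude-squared coherence per frequency bin.

    Spectra are averaged over non-overlapping Hann-windowed frames.
    The frame length is an illustrative assumption.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    bins = frame_len // 2 + 1
    sxx = [0.0] * bins   # auto-spectrum of x
    syy = [0.0] * bins   # auto-spectrum of y
    sxy = [0j] * bins    # cross-spectrum of x and y
    for start in range(0, min(len(x), len(y)) - frame_len + 1, frame_len):
        fx = dft([x[start + t] * window[t] for t in range(frame_len)])
        fy = dft([y[start + t] * window[t] for t in range(frame_len)])
        for k in range(bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    # Small constant guards against division by zero in silent bins.
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k] + 1e-12) for k in range(bins)]


def classify_subbands(coh, band_edges, threshold=0.5):
    """Label each subband (lo, hi) of bin indices by its mean coherence.

    The threshold is a hypothetical value, not taken from the paper.
    """
    labels = []
    for lo, hi in band_edges:
        mean_coh = sum(coh[lo:hi]) / (hi - lo)
        labels.append("coherent" if mean_coh > threshold else "diffuse")
    return labels
```

Calling `classify_subbands(msc(x, y), [(1, 5), (5, 9), (9, 17)])` on two microphone signals returns one label per subband. In a system like the one described, that label would select which filter is applied to the subband.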

Bibliographic reference. Gonzalez-Rodriguez, Joaquin / Cruz-Llanas, Santiago / Ortega-Garcia, Javier (1998): "Coherence-based subband decomposition for robust speech and speaker recognition in noisy and reverberant rooms", in ICSLP-1998, paper 0064.