9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speaker Orientation Estimation Based on Hybridation of GCC-PHAT and HLBR

Carlos Segura (1), Alberto Abad (2), Javier Hernando (1), Climent Nadeu (1)

(1) Universitat Politècnica de Catalunya, Spain; (2) INESC-ID/IST, Portugal

This paper presents a novel approach to speaker orientation estimation in a SmartRoom environment equipped with multiple microphones. Our previous work showed that the ratio between the high-band and low-band energies (HLBR) received at each microphone is a potentially useful cue for estimating the direction in which a speaker's voice is produced. In this work, for each microphone pair, a smoothed cross-power spectrum (CPS) phase is obtained by suitably windowing the main peak of the cross-correlation sequence estimated with the GCC-PHAT (generalized cross-correlation with phase transform) method, and an HLBR is computed from the processed CPS. The proposed method keeps the computational simplicity of the HLBR algorithm while adding the robustness offered by the GCC-PHAT technique. Preliminary experiments were conducted on a database recorded for this purpose in the UPC Smart room and on the CLEAR head pose database. The proposed method performs consistently better than other state-of-the-art techniques on both databases.
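
The per-pair processing chain outlined in the abstract can be illustrated with a minimal Python/NumPy sketch. It is not the authors' implementation: the function names, the Hann window, the window half-width, the FFT length and the 3.5 kHz band split are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def gcc_phat(x1, x2, n_fft):
    """GCC-PHAT cross-correlation of two frames, with zero lag centred."""
    X1 = np.fft.rfft(x1, n_fft)
    X2 = np.fft.rfft(x2, n_fft)
    cps = X1 * np.conj(X2)          # cross-power spectrum (CPS)
    cps /= np.abs(cps) + 1e-12      # PHAT weighting: keep the phase only
    cc = np.fft.irfft(cps, n_fft)   # cross-correlation sequence
    return np.fft.fftshift(cc)      # index n_fft // 2 is lag 0


def windowed_peak_hlbr(x1, x2, fs, n_fft=1024, half_width=8, split_hz=3500.0):
    """Hypothetical helper: HLBR of the CPS after windowing the main peak.

    half_width and split_hz are illustrative assumptions, not values
    taken from the paper.
    """
    cc = gcc_phat(x1, x2, n_fft)
    peak = int(np.argmax(cc))

    # Keep only a short window around the main correlation peak
    # (indices wrap around because the sequence is circular).
    window = np.zeros(n_fft)
    idx = np.arange(peak - half_width, peak + half_width + 1) % n_fft
    window[idx] = np.hanning(2 * half_width + 1)
    cc_windowed = cc * window

    # Back to the frequency domain: this is the smoothed CPS.
    cps_smooth = np.fft.rfft(np.fft.ifftshift(cc_windowed))
    energy = np.abs(cps_smooth) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    high = energy[freqs >= split_hz].sum()
    low = energy[(freqs > 0) & (freqs < split_hz)].sum()
    lag = peak - n_fft // 2          # main-peak lag in samples
    return high / (low + 1e-12), lag


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.standard_normal(1024)
    x2 = np.roll(x1, 5)              # x2 is x1 circularly delayed by 5 samples
    ratio, lag = windowed_peak_hlbr(x1, x2, fs=16000)
    print(ratio, lag)
```

With the conjugate placed on the second channel, a delay of the second signal relative to the first appears as a negative lag. In a multi-microphone setup, one such ratio per microphone pair would presumably be combined across pairs to estimate the orientation; that combination step is not described in the abstract and is left out here.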

Bibliographic reference. Segura, Carlos / Abad, Alberto / Hernando, Javier / Nadeu, Climent (2008): "Speaker orientation estimation based on hybridation of GCC-PHAT and HLBR", in INTERSPEECH-2008, 1325-1328.