Speaker orientation estimation based on hybridation of GCC-PHAT and HLBR

Carlos Segura, Alberto Abad, Javier Hernando, Climent Nadeu

This paper presents a novel approach to speaker orientation estimation in a SmartRoom environment equipped with multiple microphones. Our previous work showed that the ratio between the high-band and low-band energies (HLBR) received at each microphone is a promising cue for estimating the direction in which a speaker's voice is projected. In this work, for each microphone pair, a smoothed cross-power spectrum (CPS) phase is obtained by suitably windowing the main peak of the cross-correlation sequence estimated with the GCC-PHAT method, and an HLBR is computed from the processed CPS. The proposed method keeps the computational simplicity of the HLBR algorithm while adding the robustness offered by the GCC-PHAT technique. Preliminary experiments were conducted on a database recorded for this purpose in the UPC Smart room and on the CLEAR head pose database. The proposed method performs consistently better than other state-of-the-art techniques on both databases.
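
The per-pair processing chain described above can be sketched as follows. This is only one possible reading of the abstract, not the authors' implementation: the function names, the raised-cosine taper, the window half-width, and the choice of band split are all assumptions made for illustration, and a naive DFT is used to keep the example dependency-free (a real system would use an FFT library).

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse transform includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(x1, x2):
    """GCC-PHAT cross-correlation of two equal-length frames.

    With this sign convention the peak index equals the (circular)
    delay of x2 relative to x1.
    """
    X1, X2 = dft(x1), dft(x2)
    cps = [a.conjugate() * b for a, b in zip(X1, X2)]
    # PHAT weighting: keep only the phase of the cross-power spectrum.
    phat = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cps]
    return [v.real for v in dft(phat, inverse=True)]


def windowed_hlbr(x1, x2, half_width=4, split_bin=None):
    """Window the main GCC-PHAT peak, then take a high/low band ratio.

    half_width and split_bin are hypothetical tuning parameters.
    Returns (peak_lag, ratio).
    """
    r = gcc_phat(x1, x2)
    n = len(r)
    peak = max(range(n), key=lambda i: r[i])

    # Keep only the samples around the main peak (circular indexing),
    # weighted by a raised-cosine taper; everything else is zeroed.
    windowed = [0.0] * n
    for d in range(-half_width, half_width + 1):
        taper = 0.5 * (1.0 + math.cos(math.pi * d / (half_width + 1)))
        windowed[(peak + d) % n] = r[(peak + d) % n] * taper

    # Back to the frequency domain: a smoothed version of the CPS.
    spec = dft(windowed)
    mags = [abs(v) for v in spec[1:n // 2]]  # positive bins, DC excluded

    if split_bin is None:
        split_bin = len(mags) // 2  # assumed split at mid-band
    low = sum(m * m for m in mags[:split_bin])
    high = sum(m * m for m in mags[split_bin:])
    return peak, (high / low if low > 0 else float("inf"))
```

For a pair of frames `x1` and `x2` taken from two microphones, `windowed_hlbr(x1, x2)` returns the lag of the main correlation peak together with the band-energy ratio of the smoothed spectrum. How such per-pair ratios are combined into an orientation estimate is not specified in the abstract and is left out here.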

Cite as: Segura, C., Abad, A., Hernando, J., Nadeu, C. (2008) Speaker orientation estimation based on hybridation of GCC-PHAT and HLBR. Proc. Interspeech 2008, 1325-1328, doi: 10.21437/Interspeech.2008-387

