INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

GCC-PHAT based Head Orientation Estimation

Carlos Segura (1,2), Javier Hernando (1)

(1) Universitat Politècnica de Catalunya, Barcelona, Spain
(2) Herta Security, S.L., Barcelona, Spain

This work presents a novel two-step algorithm for estimating the orientation of speakers in a smart-room environment equipped with microphone arrays. In the first step, the position of the speaker is estimated with the SRP-PHAT algorithm, and the time delay of arrival for each microphone pair with respect to the detected position is computed. In the second step, the value of the cross-correlation at the estimated time delay is used as the fundamental feature from which the speaker orientation is derived. The proposed method performs consistently better than other state-of-the-art acoustic techniques on both a purpose-recorded database and the CLEAR head pose database.
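
The feature described in the abstract, the GCC-PHAT cross-correlation value read at the delay implied by an already estimated speaker position, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`expected_tdoa`, `gcc_phat`, `gcc_phat_at_delay`), the speed of sound, the regularization constant, and the sign convention are all assumptions made for illustration. The SRP-PHAT localization step and the mapping from per-pair values to an orientation angle are not shown, since the abstract does not specify them.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def expected_tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Delay (seconds) of mic_a relative to mic_b for a given source position.

    Positive when the source is farther from mic_a than from mic_b.
    """
    source, mic_a, mic_b = (
        np.asarray(p, dtype=float) for p in (source, mic_a, mic_b)
    )
    return (np.linalg.norm(source - mic_a) - np.linalg.norm(source - mic_b)) / c


def gcc_phat(sig_a, sig_b):
    """GCC-PHAT cross-correlation, indexed by circular lag in samples."""
    n = len(sig_a) + len(sig_b)  # zero-pad to limit circular wrap-around
    spec_a = np.fft.rfft(sig_a, n=n)
    spec_b = np.fft.rfft(sig_b, n=n)
    cross = spec_a * np.conj(spec_b)
    # PHAT weighting: keep only the phase of the cross-spectrum.
    # The small constant (an assumption) avoids division by zero.
    cross /= np.abs(cross) + 1e-12
    return np.fft.irfft(cross, n=n)


def gcc_phat_at_delay(sig_a, sig_b, tdoa, fs):
    """Read the GCC-PHAT value at the lag nearest to a known delay."""
    cc = gcc_phat(sig_a, sig_b)
    lag = int(round(tdoa * fs))
    return cc[lag % len(cc)]  # negative lags wrap to the end of the array
```

Under these assumptions, one value of `gcc_phat_at_delay` would be computed per microphone pair, with `tdoa` taken from `expected_tdoa` at the position returned by the localizer. Rounding to the nearest sample is a simplification; interpolating the correlation around the lag would be a natural refinement.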

Index Terms: Head pose; speaker orientation; acoustic source localization


Bibliographic reference. Segura, Carlos / Hernando, Javier (2012): "GCC-PHAT based head orientation estimation", in INTERSPEECH-2012, 1740-1743.