INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Bayesian Extension of MUSIC for Sound Source Localization and Tracking

Takuma Otsuka (1), Kazuhiro Nakadai (2), Tetsuya Ogata (1), Hiroshi G. Okuno (1)

(1) Kyoto University, Japan
(2) Honda Research Institute Japan Co. Ltd., Japan

This paper presents a Bayesian extension of a MUSIC-based sound source localization (SSL) and tracking method. SSL is important for distant speech enhancement and simultaneous speech separation, both of which improve speech recognition, as well as for auditory scene analysis by mobile robots. One drawback of existing SSL methods is the need for careful parameter tuning, e.g., of the sound source detection threshold, which depends on the reverberation time and the number of sources. Our contribution consists of (1) automatic parameter estimation in the variational Bayesian framework and (2) tracking of sound sources with reliability. Experimental results demonstrate that our method robustly tracks multiple sound sources in a reverberant environment with RT20 = 840 ms.
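For readers unfamiliar with the baseline, the following is a minimal sketch of classical narrowband MUSIC with a hand-set detection threshold, i.e., the kind of tuning the abstract identifies as a drawback. It is not the Bayesian method of this paper. The uniform linear array geometry, the simulated signals, the function names, and the 20 dB threshold are all assumptions made for illustration.

```python
import numpy as np

def steering_vector(theta_deg, n_mics, spacing_wavelengths=0.5):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    k = np.arange(n_mics)
    phase = 2.0 * np.pi * spacing_wavelengths * k * np.sin(np.deg2rad(theta_deg))
    return np.exp(-1j * phase)

def simulate_snapshots(doas_deg, n_mics, n_frames, noise_std, seed=0):
    """Hypothetical test data: uncorrelated complex Gaussian sources plus noise."""
    rng = np.random.default_rng(seed)
    A = np.stack([steering_vector(d, n_mics) for d in doas_deg], axis=1)
    shape_s = (len(doas_deg), n_frames)
    s = (rng.standard_normal(shape_s) + 1j * rng.standard_normal(shape_s)) / np.sqrt(2)
    shape_n = (n_mics, n_frames)
    n = (rng.standard_normal(shape_n) + 1j * rng.standard_normal(shape_n)) / np.sqrt(2)
    return A @ s + noise_std * n

def music_spectrum(snapshots, n_sources, angles_deg):
    """Classical MUSIC pseudo-spectrum evaluated at each candidate angle."""
    n_mics, n_frames = snapshots.shape
    # Sample spatial correlation matrix of the array signals.
    R = snapshots @ snapshots.conj().T / n_frames
    # eigh sorts eigenvalues in ascending order, so the first
    # (n_mics - n_sources) eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(R)
    noise = vecs[:, : n_mics - n_sources]
    spectrum = np.empty(len(angles_deg))
    for i, ang in enumerate(angles_deg):
        a = steering_vector(ang, n_mics)
        proj = noise.conj().T @ a
        spectrum[i] = np.vdot(a, a).real / np.vdot(proj, proj).real
    return spectrum

def detect_peaks(spectrum, angles_deg, threshold_db):
    """Return angles of local maxima whose power in dB exceeds the threshold."""
    p_db = 10.0 * np.log10(spectrum)
    found = []
    for i in range(1, len(p_db) - 1):
        is_peak = p_db[i] > p_db[i - 1] and p_db[i] >= p_db[i + 1]
        if is_peak and p_db[i] > threshold_db:
            found.append(float(angles_deg[i]))
    return found

angles = np.arange(-90, 91)
x = simulate_snapshots([-20.0, 30.0], n_mics=8, n_frames=200, noise_std=0.1)
spec = music_spectrum(x, n_sources=2, angles_deg=angles)
# The threshold below is hand-picked for this simulated, anechoic case.
print(detect_peaks(spec, angles, threshold_db=20.0))
```

Note that both `n_sources` and `threshold_db` must be supplied by hand here, and suitable values would change with reverberation and with the number of active sources.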


Bibliographic reference.  Otsuka, Takuma / Nakadai, Kazuhiro / Ogata, Tetsuya / Okuno, Hiroshi G. (2011): "Bayesian extension of MUSIC for sound source localization and tracking", In INTERSPEECH-2011, 3109-3112.