ISCA Archive Interspeech 2021

Sound Source Localization with Majorization Minimization

Masahito Togami, Robin Scheibler

We propose a sound source localization technique that estimates a speech source location without a fine grid search. The source location is estimated by parameter optimization that minimizes the steered-response power (SRP) function under the near-field assumption. Because the SRP function has no closed-form minimizer, we introduce an auxiliary function of the SRP function based on the majorization-minimization (MM) algorithm. The parameters are updated iteratively to minimize the auxiliary function by alternating between time-difference-of-arrival (TDOA) estimation and range-difference (RD) based localization. When TDOA estimation and RD-based localization are performed in cascade, the accuracy of the estimated source location depends strongly on the accuracy of the estimated TDOA. In contrast, the proposed method corrects the estimated TDOA by referring to the source location estimated in the previous iteration. The proposed method is therefore expected to be robust against the TDOA estimation errors that occur in reverberant environments. Experimental results show that the proposed method outperforms conventional techniques in a reverberant environment.
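To make the RD-based localization step concrete, the following is a minimal sketch of a standard unconstrained linear least-squares solver for range differences, run on noiseless near-field TDOAs. It is not the authors' MM algorithm; it only illustrates the second stage of the cascade that the abstract contrasts against. The function names, the 2-D microphone layout, and the speed of sound of 343 m/s are assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for illustration


def near_field_tdoas(source, mics, c=SPEED_OF_SOUND):
    """TDOA of each microphone relative to microphone 0 (near-field model)."""
    dists = np.linalg.norm(mics - source, axis=1)
    return (dists[1:] - dists[0]) / c


def localize_from_tdoas(tdoas, mics, c=SPEED_OF_SOUND):
    """Estimate the source position from TDOAs by linear least squares.

    With rd_i = c * tdoa_i and R = ||s - m_0||, squaring
    ||s - m_i|| = rd_i + R gives an equation linear in (s, R):
        2 (m_i - m_0)^T s + 2 rd_i R = ||m_i||^2 - ||m_0||^2 - rd_i^2
    R is treated as a free unknown (unconstrained solution).
    """
    rd = c * np.asarray(tdoas)
    m0 = mics[0]
    A = np.hstack([2.0 * (mics[1:] - m0), 2.0 * rd[:, None]])
    b = np.sum(mics[1:] ** 2, axis=1) - np.sum(m0 ** 2) - rd ** 2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:-1]  # drop the auxiliary range unknown R


# Hypothetical 2-D setup: five microphones and one source (metres).
mics = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 1.5]])
source = np.array([2.0, 3.0])

tdoas = near_field_tdoas(source, mics)
estimate = localize_from_tdoas(tdoas, mics)
print(estimate)
```

With exact TDOAs this solver recovers the source position up to numerical precision. Any error in the TDOAs passes directly into the right-hand side and the system matrix, which is the sensitivity of the cascade approach that the abstract describes.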

doi: 10.21437/Interspeech.2021-126

Cite as: Togami, M., Scheibler, R. (2021) Sound Source Localization with Majorization Minimization. Proc. Interspeech 2021, 2122-2126, doi: 10.21437/Interspeech.2021-126

  author={Masahito Togami and Robin Scheibler},
  title={{Sound Source Localization with Majorization Minimization}},
  booktitle={Proc. Interspeech 2021},