SAPA-SCALE Conference 2012

Portland, OR, USA
September 7-8, 2012

Joint Detection and Localization of Multiple Speakers Using a Probabilistic Interpretation of the Steered Response Power

Youssef Oualil (1,2), Mathew Magimai-Doss (2), Friedrich Faubel (1), Dietrich Klakow (1)

(1) Spoken Language Systems, Saarland University, Saarbrücken, Germany
(2) Idiap Research Institute, CH-1920 Martigny, Switzerland

Detection and localization of multiple speakers in a noisy and reverberant environment is a fundamental and difficult task. In the literature, this task is typically addressed with steered response power (SRP) based techniques, which can be computationally intensive. Even so, the localization of multiple speakers remains challenging in practice. In this paper, we present a novel approach based on a probabilistic interpretation of the SRP. The proposed method replaces the discrete search techniques with an approximate analytical form of the SRP, which can adequately detect and localize multiple speakers. In addition to reliable detection and localization, a potential advantage of this approach is that it provides a probability density function (pdf) of the individual speaker positions rather than point estimates. Experiments on the AV16.3 corpus show the efficacy of the proposed approach.

Index Terms: Steered response power, Multiple speaker localization, Gaussian mixture


Bibliographic reference. Oualil, Youssef / Magimai-Doss, Mathew / Faubel, Friedrich / Klakow, Dietrich (2012): "Joint detection and localization of multiple speakers using a probabilistic interpretation of the steered response power", in SAPA-SCALE-2012, 68-73.