This paper presents a method for speaker activity detection in small meetings. The activity of the participants is deduced from audio streams obtained by multiple microphone arrays. One novelty of the proposed approach is its use of a human tracker that relies on scanning laser range finders to localize the participants. First, the beamforming algorithm that creates the audio streams at each microphone array exploits this location information. Then, at each array, speaker activity detection is performed using Gaussian mixture models trained beforehand. Finally, a fusion procedure that also uses the location information combines the detection results of the different microphone arrays. An experiment reproducing a meeting configuration demonstrates the effectiveness of the system.
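As an illustration only, the way a tracked position could feed both the beamforming and the fusion stages might be sketched as follows. The abstract does not specify the beamformer type or the fusion rule, so this sketch assumes a simple delay-and-sum steering and an inverse-distance weighting of per-array scores. The function names, the speed-of-sound constant, the log-likelihood-ratio scores, and the threshold are all hypothetical and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value (not from the paper)


def steering_delays(mic_positions, speaker_position):
    """Relative arrival delays (seconds) for each microphone, given a tracked
    speaker position. Assumes delay-and-sum steering; the paper's actual
    beamformer may differ."""
    dists = [math.dist(m, speaker_position) for m in mic_positions]
    ref = min(dists)  # the closest microphone serves as the zero-delay reference
    return [(d - ref) / SPEED_OF_SOUND for d in dists]


def fuse_scores(llr_per_array, array_positions, speaker_position):
    """Combine per-array speech/non-speech log-likelihood ratios with
    inverse-distance weights (an assumed rule, not the paper's)."""
    weights = [
        1.0 / max(math.dist(a, speaker_position), 1e-6)  # guard against zero distance
        for a in array_positions
    ]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, llr_per_array)) / total


def is_active(fused_llr, threshold=0.0):
    """Declare the speaker active when the fused score exceeds the threshold."""
    return fused_llr > threshold


# Hypothetical usage: two arrays, one tracked participant at (1, 0).
delays = steering_delays([(0.0, 0.0), (0.1, 0.0)], (1.0, 0.0))
fused = fuse_scores([2.0, -2.0], [(0.0, 0.0), (4.0, 0.0)], (1.0, 0.0))
active = is_active(fused)
```

In this sketch the array closer to the tracked participant contributes more to the fused decision, which is one plausible way location information could be used when combining detections. The per-array scores would in practice come from the Gaussian mixture models mentioned above.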
Bibliographic reference. Even, Jani / Heracleous, Panikos / Ishi, Carlos T. / Hagita, Norihiro (2011): "Range based multi microphone array fusion for speaker activity detection in small meetings", In INTERSPEECH-2011, 2737-2740.