INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Hybrid Approach to Online Speaker Diarization

Carlos Vaquero (1), Oriol Vinyals (2), Gerald Friedland (2)

(1) Universidad de Zaragoza, Spain
(2) ICSI, USA

This article presents a low-latency speaker diarization system (“who is speaking now?”) based on a hybrid approach that combines a traditional offline speaker diarization system (“who spoke when?”) with an online speaker identification system. The system fulfills all requirements of the diarization task: it needs no a priori information about the input, not even speaker-specific models. After an initialization phase, the approach allows a low-latency decision on the current speaker with an accuracy close to that of the underlying offline diarization system. The article describes the approach, evaluates the robustness of the system, and analyzes the latency/accuracy trade-off.
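The hybrid idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an offline diarization pass over the initialization phase has already produced one speaker label per feature frame, and it swaps in a single diagonal Gaussian per speaker as a stand-in for whatever speaker models the real system trains. All function names, the toy feature vectors, and the `window_size` parameter are hypothetical.

```python
import math
from collections import defaultdict


def train_speaker_models(frames, labels):
    """Fit one diagonal Gaussian per speaker from offline diarization labels.

    Assumption: `labels` holds one speaker label per frame, as produced by
    an offline diarization pass over the initialization phase.
    """
    grouped = defaultdict(list)
    for frame, label in zip(frames, labels):
        grouped[label].append(frame)

    models = {}
    for label, rows in grouped.items():
        dim = len(rows[0])
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        # Floor the variance so that near-constant dimensions stay usable.
        var = [
            max(sum((r[d] - mean[d]) ** 2 for r in rows) / len(rows), 1e-3)
            for d in range(dim)
        ]
        models[label] = (mean, var)
    return models


def log_likelihood(window, model):
    """Sum the per-frame diagonal-Gaussian log-likelihoods over a window."""
    mean, var = model
    total = 0.0
    for frame in window:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify_online(stream, models, window_size):
    """Yield one speaker decision per non-overlapping window of frames."""
    for start in range(0, len(stream) - window_size + 1, window_size):
        window = stream[start:start + window_size]
        yield max(models, key=lambda spk: log_likelihood(window, models[spk]))


# Toy initialization data: two well-separated hypothetical speakers.
init_frames = [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0],
               [5.0, 5.1], [4.9, 5.2], [5.1, 4.8]]
init_labels = ["spk0", "spk0", "spk0", "spk1", "spk1", "spk1"]
models = train_speaker_models(init_frames, init_labels)

# Incoming frames after initialization, decided two frames at a time.
stream = [[0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 4.9]]
decisions = list(identify_online(stream, models, window_size=2))
print(decisions)
```

In this sketch, `window_size` is the knob behind the latency/accuracy trade-off: a shorter window yields a faster decision based on less evidence, while a longer window delays the decision but averages over more frames.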


Bibliographic reference. Vaquero, Carlos / Vinyals, Oriol / Friedland, Gerald (2010): "A hybrid approach to online speaker diarization", in INTERSPEECH-2010, 2638-2641.