12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization

Viet-Anh Tran (1), Viet Bac Le (2), Claude Barras (1), Lori Lamel (1)

(1) LIMSI, France
(2) Vocapia Research, France

Acoustic speaker diarization is investigated for situations where a collection of shows from the same source needs to be processed. In this case, the same speaker should receive the same label across all shows. We compare different architectures for cross-show speaker diarization: the obvious concatenation of all shows, a hybrid system that runs a local clustering stage followed by a global clustering stage, and an incremental system which processes the shows in a predefined order and updates the speaker models accordingly, the latter being best suited to real application scenarios. These three strategies were compared to a baseline single-show system on a set of 46 ten-minute samples of British English scientific podcasts.
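
The three architectures named in the abstract can be contrasted with a toy sketch. It is not the authors' system: segments are stood in for by 2-D feature vectors, speaker models by plain centroids, and the merge criterion by a Euclidean distance threshold, all of which are assumptions made for illustration. The function names, the `shows` data, and the threshold value are likewise hypothetical.

```python
from math import dist

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def cluster(points, threshold):
    """Toy agglomerative clustering (an assumed stand-in for the real
    acoustic clustering): merge the two clusters with the closest centroids
    until that distance exceeds `threshold`. Returns lists of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        d, a, b = min(
            (dist(centroid([points[i] for i in clusters[a]]),
                  centroid([points[i] for i in clusters[b]])), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters

def local_clusters(segs, threshold):
    """Single-show stage: (member indices, centroid) for each local cluster."""
    return [(m, centroid([segs[i] for i in m])) for m in cluster(segs, threshold)]

def concatenation(shows, threshold):
    """Pool the segments of all shows and cluster them in one pass."""
    keys = [(s, i) for s, segs in shows.items() for i in range(len(segs))]
    points = [shows[s][i] for s, i in keys]
    return {keys[i]: label
            for label, members in enumerate(cluster(points, threshold))
            for i in members}

def hybrid(shows, threshold):
    """Cluster within each show, then cluster the local centroids globally."""
    local = [(s, m, c) for s, segs in shows.items()
             for m, c in local_clusters(segs, threshold)]
    labels = {}
    for label, group in enumerate(cluster([c for _, _, c in local], threshold)):
        for g in group:
            s, members, _ = local[g]
            for i in members:
                labels[(s, i)] = label
    return labels

def incremental(shows, order, threshold):
    """Process shows in `order`; attach each local cluster to the closest
    known speaker model if it is near enough, otherwise open a new model.
    The matched model is updated with the newly assigned segments."""
    models = []  # one list of feature vectors per known speaker
    labels = {}
    for s in order:
        for members, c in local_clusters(shows[s], threshold):
            dists = [dist(c, centroid(m)) for m in models]
            if dists and min(dists) <= threshold:
                label = dists.index(min(dists))
            else:
                label = len(models)
                models.append([])
            models[label].extend(shows[s][i] for i in members)
            for i in members:
                labels[(s, i)] = label
    return labels

def partition(labels):
    """Label-independent view of a result, for comparing strategies."""
    groups = {}
    for key, label in labels.items():
        groups.setdefault(label, set()).add(key)
    return {frozenset(g) for g in groups.values()}

# Hypothetical data: two shows sharing two speakers, plus one new speaker.
shows = {
    "ep1": [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
    "ep2": [(0.0, 0.1), (9.0, 0.0), (5.1, 5.0)],
}
results = [concatenation(shows, 1.0),
           hybrid(shows, 1.0),
           incremental(shows, ["ep1", "ep2"], 1.0)]
```

On this well-separated toy data the three strategies produce the same speaker partition; the sketch only shows how they differ in structure. The incremental variant is the only one that never revisits earlier shows, which is why it suits a setting where shows arrive over time.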

Bibliographic reference. Tran, Viet-Anh / Le, Viet Bac / Barras, Claude / Lamel, Lori (2011): "Comparing multi-stage approaches for cross-show speaker diarization", in INTERSPEECH-2011, 1053-1056.