INTERSPEECH 2011
Acoustic speaker diarization is investigated for situations where a collection of shows from the same source needs to be processed. In this case, the same speaker should receive the same label across all shows. We compare different architectures for cross-show speaker diarization: the obvious concatenation of all shows, a hybrid system that applies a local clustering stage followed by a global clustering stage, and an incremental system which processes the shows in a predefined order and updates the speaker models accordingly, the latter being best suited to real application scenarios. These three strategies were compared to a baseline single-show system on a set of 46 ten-minute samples of British English scientific podcasts.
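To illustrate the incremental strategy, here is a minimal sketch in Python. It is not the system evaluated in the paper: it assumes each show has already been diarized locally, that every local speaker is summarized by a single non-zero feature vector, and that speakers are matched by cosine similarity against a fixed threshold. The function name `incremental_cross_show`, the `threshold` value, and the running-mean model update are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def incremental_cross_show(shows, threshold=0.9):
    """Assign global speaker labels show by show, in the given order.

    shows: list of dicts, one per show, mapping a local speaker label
           to that speaker's vector (hypothetical representation).
    Returns one dict per show mapping local label -> global label.
    """
    models = {}   # global label -> (running mean vector, number of updates)
    results = []
    for show in shows:
        mapping = {}
        for local_label, vec in show.items():
            # Look for the closest known speaker above the threshold.
            best_label, best_sim = None, threshold
            for global_label, (centroid, _) in models.items():
                sim = cosine(vec, centroid)
                if sim >= best_sim:
                    best_label, best_sim = global_label, sim
            if best_label is None:
                # No match: open a new global speaker model.
                best_label = f"spk{len(models)}"
                models[best_label] = (list(vec), 1)
            else:
                # Match: update the speaker model with a running mean.
                centroid, n = models[best_label]
                updated = [(c * n + v) / (n + 1) for c, v in zip(centroid, vec)]
                models[best_label] = (updated, n + 1)
            mapping[local_label] = best_label
        results.append(mapping)
    return results


shows = [
    {"A": [1.0, 0.0], "B": [0.0, 1.0]},
    {"X": [0.9, 0.1], "Y": [-1.0, 0.0]},
]
labels = incremental_cross_show(shows)
```

In this toy input, local speaker `X` of the second show is close to `A` of the first show and reuses its global label, while `Y` matches no existing model and receives a new one. Because shows are processed in a fixed order and earlier decisions are never revisited, the outcome depends on that order, unlike in the concatenation and hybrid architectures.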
Bibliographic reference. Tran, Viet-Anh / Le, Viet Bac / Barras, Claude / Lamel, Lori (2011): "Comparing multi-stage approaches for cross-show speaker diarization", In INTERSPEECH-2011, 1053-1056.