Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation

Giovanni Soldi, Massimiliano Todisco, Héctor Delgado, Christophe Beaugeant, Nicholas Evans


Almost all current diarization systems are off-line and ill-suited to the growing need for on-line or real-time diarization. Our previous work reported the first on-line diarization system for meeting data, the most challenging speaker diarization domain. Although results were comparable to those reported for on-line diarization in less challenging domains, error rates were high and unlikely to support any practical application. The first novel contribution of this paper is the investigation of a semi-supervised approach to on-line diarization in which speaker models are seeded with a modest amount of manually labelled data. In practical applications involving meetings, such data can be obtained readily from brief roundtable introductions. The second novel contribution is an incremental MAP adaptation procedure for efficient, on-line speaker modelling. When combined, these two developments provide an on-line diarization system which outperforms a baseline, off-line system by a significant margin. When configured appropriately, error rates may be low enough to support practical applications.
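
The abstract does not spell out the incremental MAP procedure, so the following is only a minimal sketch of one common way such an update can be organised, not the authors' implementation. It assumes a diagonal-covariance GMM background model, classical relevance-factor MAP adaptation of the component means only, and frame posteriors computed against the fixed background model. The class name `IncrementalMapSpeaker`, the helper `posteriors`, and the `relevance` parameter are all hypothetical names introduced for illustration.

```python
import math


def posteriors(x, weights, means, variances):
    """Component responsibilities of one frame under a diagonal-covariance GMM."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        logs.append(ll)
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]


class IncrementalMapSpeaker:
    """Speaker model whose means are MAP-adapted from accumulated statistics.

    Assumption: only zeroth- and first-order statistics are stored, so each
    new segment is absorbed without revisiting earlier audio.
    """

    def __init__(self, weights, means, variances, relevance=10.0):
        self.weights = weights
        self.prior_means = means      # background-model means (the prior)
        self.variances = variances
        self.relevance = relevance    # hypothetical relevance factor
        self.n = [0.0] * len(weights)                      # soft counts
        self.f = [[0.0] * len(means[0]) for _ in means]    # weighted sums

    def update(self, frames):
        """Accumulate statistics from a new batch of feature frames."""
        for x in frames:
            gamma = posteriors(x, self.weights, self.prior_means, self.variances)
            for k, g in enumerate(gamma):
                self.n[k] += g
                for d, xd in enumerate(x):
                    self.f[k][d] += g * xd

    def means(self):
        """MAP estimate of the means: (F_k + r * mu_k) / (n_k + r)."""
        r = self.relevance
        return [
            [(fkd + r * mkd) / (nk + r) for fkd, mkd in zip(fk, mk)]
            for nk, fk, mk in zip(self.n, self.f, self.prior_means)
        ]
```

In a semi-supervised setting of the kind described above, the first call to `update` for each speaker could be made with that speaker's manually labelled introduction, and later calls with segments attributed to the speaker on-line. Under the stated assumptions, adapting on data in several batches gives the same means as adapting on all of it at once.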


DOI: 10.21437/Odyssey.2016-55

Cite as

Soldi, G., Todisco, M., Delgado, H., Beaugeant, C., Evans, N. (2016) Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation. Proc. Odyssey 2016, 377-384.

BibTeX
@inproceedings{Soldi+2016,
author={Giovanni Soldi and Massimiliano Todisco and Héctor Delgado and Christophe Beaugeant and Nicholas Evans},
title={Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation},
year=2016,
booktitle={Odyssey 2016},
doi={10.21437/Odyssey.2016-55},
url={http://dx.doi.org/10.21437/Odyssey.2016-55},
pages={377--384}
}