8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

An Improved Speaker Diarization System

Rong Fu, Ian D. Benest

University of York, UK

This paper describes an automatic speaker diarization system for natural, multi-speaker meeting conversations recorded with a single central microphone. The new system is robust to different acoustic environments: it requires neither pre-trained models nor development sets to initialize its parameters, and it determines the model complexity automatically. It adapts each segment model from a universal background model and uses the cross-likelihood ratio, instead of the Bayesian Information Criterion (BIC), as the merging criterion. Finally, it uses an intra-cluster/inter-cluster ratio as the stopping criterion. Together, these changes reduce the speaker diarization error rate from 21.76% for the baseline system [1] to 17.21%.
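
The abstract does not state the formula used for the cross-likelihood ratio. As a rough illustration only, the sketch below uses a form that is common in the diarization literature: CLR(a, b) = (1/n_a) log[p(X_a | M_b) / p(X_a | B)] + (1/n_b) log[p(X_b | M_a) / p(X_b | B)], where B is the background model. To stay short it assumes single diagonal-covariance Gaussians fitted directly to the data, in place of the models adapted from a universal background model that the paper describes. The function names (`fit_diag_gauss`, `diag_gauss_loglik`, `cross_likelihood_ratio`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_diag_gauss(x, var_floor=1e-3):
    """Fit a diagonal Gaussian to frames x of shape (n_frames, n_dims).

    Assumption: a single Gaussian stands in for the adapted segment model.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    var = np.maximum(x.var(axis=0), var_floor)  # floor avoids division by ~0
    return mean, var


def diag_gauss_loglik(x, mean, var):
    """Average per-frame log-likelihood of x under a diagonal Gaussian."""
    x = np.asarray(x, dtype=float)
    per_frame = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var).sum(axis=1)
    return per_frame.mean()


def cross_likelihood_ratio(xa, xb, background):
    """Symmetric, length-normalized cross-likelihood ratio of two clusters.

    `background` is a (mean, var) pair playing the role of the background model.
    Larger values suggest the two clusters are more likely the same speaker.
    """
    model_a = fit_diag_gauss(xa)
    model_b = fit_diag_gauss(xb)
    a_under_b = diag_gauss_loglik(xa, *model_b) - diag_gauss_loglik(xa, *background)
    b_under_a = diag_gauss_loglik(xb, *model_a) - diag_gauss_loglik(xb, *background)
    return a_under_b + b_under_a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "features": two segments from one source, one from another.
    seg_a1 = rng.normal(0.0, 1.0, size=(500, 4))
    seg_a2 = rng.normal(0.0, 1.0, size=(500, 4))
    seg_b = rng.normal(3.0, 1.0, size=(500, 4))
    background = fit_diag_gauss(np.vstack([seg_a1, seg_a2, seg_b]))
    print("same source:     ", cross_likelihood_ratio(seg_a1, seg_a2, background))
    print("different source:", cross_likelihood_ratio(seg_a1, seg_b, background))
```

In an agglomerative scheme of this kind, the pair of clusters with the highest ratio would be the candidate for the next merge. When to stop merging is a separate decision, which the paper handles with its intra-cluster/inter-cluster ratio (not sketched here, since the abstract does not define it).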

Bibliographic reference. Fu, Rong / Benest, Ian D. (2007): "An improved speaker diarization system", In INTERSPEECH-2007, 2605-2608.