9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Improved Novelty Detection for Online GMM Based Speaker Diarization

Konstantin Markov, Satoshi Nakamura

ATR-SLC, Japan

Detecting speakers that have not been seen before is an essential part of every online speaker diarization system, and new-speaker detection accuracy has a direct impact on overall diarization performance. Our previous system performed novelty detection with a global threshold on the GMM likelihood ratio (LR). However, analysis of that system showed that the optimal threshold depends on the speaker's gender as well as on the number of registered speakers. In this paper, we present the results of this analysis and our approach to solving the problem. First, we use different thresholds for male and female speakers. Second, for each gender, we apply mean and variance normalization to the likelihood ratio before thresholding. This greatly reduces the threshold's dependency on the number of speakers and allows a fixed threshold to be used for each gender. The LR distribution statistics are collected online and updated every time a new likelihood ratio is calculated. Experiments on the TC-STAR database showed that, compared with the previous global threshold method, the new novelty detection approach reduces the speaker diarization error rate by up to 35%.
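
The normalization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update formulas, so the running statistics here use Welford's algorithm as a stand-in, and the class names `RunningStats` and `NoveltyDetector`, the threshold values, and the decision direction (a normalized LR below the threshold signals a new speaker) are all assumptions made for illustration. How the raw likelihood ratio and the gender label are obtained is outside the scope of the sketch.

```python
import math


class RunningStats:
    """Running mean and variance via Welford's algorithm (an assumed choice)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation; 0.0 until two samples are seen.
        return math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0


class NoveltyDetector:
    """Hypothetical per-gender detector with a fixed threshold per gender."""

    def __init__(self, thresholds):
        # thresholds: mapping from gender label to a fixed threshold
        # on the normalized LR (values are illustrative, not from the paper).
        self.thresholds = dict(thresholds)
        self.stats = {g: RunningStats() for g in self.thresholds}

    def normalized_lr(self, gender, lr):
        # Assumption: statistics are updated with the new LR first,
        # then the same LR is mean- and variance-normalized.
        stats = self.stats[gender]
        stats.update(lr)
        std = stats.std()
        if std == 0.0:
            return 0.0  # not enough data to normalize yet
        return (lr - stats.mean) / std

    def is_new_speaker(self, gender, lr):
        # Assumption: a low normalized LR means no registered speaker
        # model explains the segment well, so the speaker is novel.
        return self.normalized_lr(gender, lr) < self.thresholds[gender]
```

A caller would construct the detector once, for example `NoveltyDetector({"male": -1.5, "female": -1.5})`, and call `is_new_speaker(gender, lr)` for each incoming segment. Because the statistics are kept separately per gender and the score is normalized before comparison, the same fixed threshold can stay in place as the LR distribution shifts with the number of registered speakers.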


Bibliographic reference. Markov, Konstantin / Nakamura, Satoshi (2008): "Improved novelty detection for online GMM based speaker diarization", in INTERSPEECH-2008, 363-366.