INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Speaker Diarization Using Normalized Cross Likelihood Ratio

Viet-Bac Le, Odile Mella, Dominique Fohr

LORIA, France

In this paper, we present the Normalized Cross Likelihood Ratio (NCLR) and the advantages of using it in a speaker diarization system. First, the NCLR is used as a dissimilarity measure between two Gaussian speaker models in the speaker change detection step, and its contribution to the performance of speaker change detection is compared with that of the BIC and Hotelling's T²-statistic measures. Then, the NCLR measure is modified to deal with multi-Gaussian adapted models in the cluster recombination step. This step ends the step-by-step speaker diarization process, following the BIC-based hierarchical clustering and the Viterbi re-segmentation steps. Compared with the CLR (Cross Likelihood Ratio) measure, the NCLR measure reduces the diarization error by more than 30% relative on the ESTER evaluation data.
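The abstract does not give the NCLR formula. As a rough illustration of the first use described above (a dissimilarity between two Gaussian speaker models), the sketch below assumes the form NCLR(Mi, Mj) = (1/Ni) log[L(Xi|Mi) / L(Xi|Mj)] + (1/Nj) log[L(Xj|Mj) / L(Xj|Mi)], where Xi and Xj are the frames of two segments and Ni, Nj their frame counts. This form, the diagonal-covariance simplification, the variance floor, and all function names are assumptions made for illustration; they are not taken from this page.

```python
import math

def fit_diag_gaussian(frames):
    """Fit one diagonal-covariance Gaussian to a list of feature vectors.

    Diagonal covariance is a simplification assumed for this sketch.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    # Maximum-likelihood variances, with a small floor (illustrative value)
    # so that the logarithm and the division below stay well defined.
    variances = [
        sum((f[k] - means[k]) ** 2 for f in frames) / n + 1e-6
        for k in range(dim)
    ]
    return means, variances

def avg_log_likelihood(frames, model):
    """Per-frame average log-likelihood of frames under a Gaussian model."""
    means, variances = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def nclr(frames_i, frames_j):
    """Assumed NCLR: sum of the two length-normalized cross log-likelihood ratios."""
    model_i = fit_diag_gaussian(frames_i)
    model_j = fit_diag_gaussian(frames_j)
    return (
        avg_log_likelihood(frames_i, model_i) - avg_log_likelihood(frames_i, model_j)
        + avg_log_likelihood(frames_j, model_j) - avg_log_likelihood(frames_j, model_i)
    )
```

Under these assumptions, a larger value suggests that the two segments come from different speakers, and a speaker change would be hypothesized where the value exceeds some threshold (not specified here). The sketch does not cover the modified measure for multi-Gaussian adapted models used in the cluster recombination step.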

Bibliographic reference.  Le, Viet-Bac / Mella, Odile / Fohr, Dominique (2007): "Speaker diarization using normalized cross likelihood ratio", In INTERSPEECH-2007, 1869-1872.