ODYSSEY 2004 - The Speaker and Language Recognition Workshop

May 31 - June 3, 2004
Toledo, Spain

Unsupervised Speaker Segmentation of Broadcast News using MDL-Based Gaussian Model

Jia-Hsin Hsieh, Chung-Hsien Wu

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

This paper proposes an approach to unsupervised speaker segmentation and gender discrimination of broadcast news. A speaker segmentation mechanism based on an MDL-based Gaussian model is first applied to detect speaker changes using the mean and covariance of the Gaussian model. The speaker segments delimited by these changes are then smoothed and classified as male or female. Experimental results on broadcast news show that the proposed method outperforms the Delta-BIC method for speaker segmentation, achieving a 9.2% missed detection rate and a 7.5% false alarm rate. In addition, segment-based gender discrimination improves accuracy by 9% over the clip-based discriminator.
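For readers unfamiliar with the Delta-BIC baseline named in the abstract, the following is a minimal sketch of Gaussian change-point detection with the standard Delta-BIC criterion. It is not the authors' MDL-based method, which the abstract does not specify. The diagonal-covariance simplification, the function names, the penalty weight `lam`, and the `margin` parameter are all assumptions made for illustration.

```python
import math
import random

def log_det_diag(frames):
    """Log-determinant of the ML diagonal covariance of a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # floor guards against zero variance
    return total

def delta_bic(frames, i, lam=1.0):
    """Delta-BIC for a hypothesized speaker change at frame index i.

    Positive values favor two Gaussians (a change) over one.
    Assumes diagonal covariances, so each Gaussian has 2 * dim parameters.
    """
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:i], frames[i:]
    gain = 0.5 * (n * log_det_diag(frames)
                  - len(left) * log_det_diag(left)
                  - len(right) * log_det_diag(right))
    penalty = 0.5 * lam * (2 * dim) * math.log(n)
    return gain - penalty

def best_change_point(frames, margin=10, lam=1.0):
    """Return (index, score) of the best change point, or (None, score) if none is positive."""
    scores = [(delta_bic(frames, i, lam), i)
              for i in range(margin, len(frames) - margin + 1)]
    score, idx = max(scores)
    return (idx, score) if score > 0 else (None, score)

# Synthetic two-"speaker" example: the feature mean shifts at frame 100.
rng = random.Random(0)
frames = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
          + [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(100)])
print(best_change_point(frames))
```

In practice the frames would be acoustic feature vectors (for example, cepstral coefficients) rather than synthetic Gaussian samples, and the search would run over a sliding window rather than a single fixed segment.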


Bibliographic reference.  Hsieh, Jia-Hsin / Wu, Chung-Hsien (2004): "Unsupervised speaker segmentation of broadcast news using MDL-based Gaussian model", In ODYS-2004, 345-348.