Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion

Alain Tritschler, Ramesh A. Gopinath

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

Detection of speaker, channel and environment changes in a continuous audio stream is important in various applications (e.g., broadcast news, meetings, teleconferences). Standard segmentation schemes rely on a classifier and hence do not generalize to unseen speakers, channels or environments. Recently, S. Chen introduced new segmentation and clustering algorithms based on the Bayesian Information Criterion (BIC). This paper presents more accurate and more efficient variants of the BIC scheme for segmentation and clustering. Specifically, the new algorithms improve the speed and accuracy of segmentation and clustering and allow for a real-time implementation of simultaneous transcription, segmentation and speaker tracking.
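
For orientation, the baseline BIC change test that the abstract refers to can be sketched as follows. This is only an illustration of the commonly cited formulation (one full-covariance Gaussian for the whole window versus one Gaussian on each side of a candidate boundary), not the improved variants proposed in this paper. The function names, the `margin` parameter, and the penalty weight `lam` are assumptions made for this sketch.

```python
import numpy as np


def logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, i, lam=1.0):
    """Delta-BIC for splitting the window x (frames x dims) at frame i.

    Positive values favour two Gaussians (a change at i) over one.
    """
    n, d = x.shape
    # Extra parameters of the two-model hypothesis: one mean and one
    # full covariance, weighted by log(n) as in the usual BIC penalty.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (
        n * logdet_cov(x)
        - i * logdet_cov(x[:i])
        - (n - i) * logdet_cov(x[i:])
    )
    return gain - lam * penalty


def detect_change(x, margin=20, lam=1.0):
    """Return the best change frame in the window, or None if none is supported.

    margin (hypothetical parameter) keeps enough frames on each side
    to estimate a covariance matrix.
    """
    n = x.shape[0]
    candidates = list(range(margin, n - margin))
    if not candidates:
        return None
    scores = [delta_bic(x, i, lam) for i in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

Here `x` stands for a window of feature vectors, one frame per row (for example cepstral features). A caller would typically slide or grow this window over the stream and restart it after each detected change; that search strategy is left out of the sketch.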


Bibliographic reference. Tritschler, Alain / Gopinath, Ramesh A. (1999): "Improved speaker segmentation and segments clustering using the Bayesian information criterion", In EUROSPEECH'99, 679-682.