Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Fast Speaker Change Detection for Broadcast News Transcription and Indexing

Daben Liu, Francis Kubala

BBN Technologies, GTE Corporation, Cambridge, MA, USA

In this paper, we describe a new speaker change detection (SCD) algorithm designed for fast transcription and audio indexing of spoken broadcast news. The algorithm has two stages. The first stage is a gender-independent phone-class recognition pass. We collapse the phoneme inventory to only 4 broad classes and include 4 different models for non-speech, resulting in a small, fast decoder that runs in less than 0.1 times real-time. The second stage hypothesizes a speaker change boundary between every pair of adjacent phones in the labeled input. The phone-level time resolution of our approach permits the algorithm to run quickly while maintaining the same accuracy as a frame-level approach. Applying the new algorithm to a large sample of broadcast news programs resulted in improvements in speaker change detection accuracy, speech recognition accuracy, and speed.
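The saving from phone-level resolution can be illustrated with a minimal sketch. This is not the authors' implementation: the segment labels, the example timings, and the 10 ms frame step are all assumptions made for illustration, and the sketch only enumerates candidate boundaries without scoring them.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    label: str    # hypothetical broad-class or non-speech label


def candidate_boundaries(segments: List[Segment]) -> List[float]:
    """Return one candidate change time per pair of adjacent segments."""
    return [left.end for left, _right in zip(segments, segments[1:])]


def frame_level_candidates(segments: List[Segment], frame_step: float = 0.01) -> int:
    """Count interior frame boundaries over the same span.

    The 10 ms frame step is an assumed value, not one taken from the paper.
    """
    duration = segments[-1].end - segments[0].start
    return max(int(round(duration / frame_step)) - 1, 0)


# Hypothetical output of a phone-class recognition pass (labels are invented).
segments = [
    Segment(0.00, 0.12, "vowel"),
    Segment(0.12, 0.20, "fricative"),
    Segment(0.20, 0.55, "silence"),
    Segment(0.55, 0.63, "stop"),
    Segment(0.63, 0.80, "vowel"),
]

phone_cands = candidate_boundaries(segments)
frame_cands = frame_level_candidates(segments)
print(len(phone_cands), "phone-level candidates vs", frame_cands, "frame-level candidates")
```

Under these assumptions, the second stage would test only the few boundaries between labeled segments instead of every frame boundary in the same stretch of audio. How each candidate is accepted or rejected is left out of the sketch because the abstract does not specify it.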



Bibliographic reference.  Liu, Daben / Kubala, Francis (1999): "Fast speaker change detection for broadcast news transcription and indexing", In EUROSPEECH'99, 1031-1034.