Interspeech'2005 - Eurospeech
In this paper we propose a novel method for detecting relevant changes in a continuous acoustic stream. The aim is to identify the optimal number and positions of the change-points that split the signal into shorter, more or less homogeneous sections. First we describe the theory used to derive the segmentation algorithm. Then we show how this algorithm can be implemented efficiently. The method is evaluated on broadcast news data, with the goal of segmenting it into parts belonging to different speakers. In simulated tests with artificially mixed utterances, the algorithm identified 97.1% of all speaker changes with a precision of 96.5%. In tests on 30 hours of real broadcast news (in 9 languages), the average recall was 80% and the average precision was 72.3%.
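The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea of choosing the number and positions of change-points jointly by maximizing a global BIC-style score with dynamic programming. It rests on assumptions that are not taken from the paper: scalar (1-D) features, one Gaussian per segment, a penalty of two parameters per segment, and the hypothetical names `segment_bic`, `min_len`, `lam` and `var_floor`. It uses a plain O(n^2) search and does not reproduce the efficient implementation the authors describe.

```python
import math


def segment_bic(x, min_len=5, lam=1.0, var_floor=1e-6):
    """Return sorted change-point indices that maximize a global BIC-style score.

    Assumptions (illustrative only): 1-D features, one Gaussian per segment,
    2 free parameters (mean, variance) per segment, penalty weight `lam`.
    """
    n = len(x)
    if n < 2 * min_len:
        return []  # too short to hold two segments

    # Prefix sums give any segment's mean and variance in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for t, v in enumerate(x):
        s1[t + 1] = s1[t] + v
        s2[t + 1] = s2[t] + v * v

    def loglik(i, j):
        # Maximized Gaussian log-likelihood of samples x[i:j].
        m = j - i
        mean = (s1[j] - s1[i]) / m
        var = max((s2[j] - s2[i]) / m - mean * mean, var_floor)
        return -0.5 * m * (math.log(2.0 * math.pi * var) + 1.0)

    # BIC penalty charged once per segment: 0.5 * lam * (#params) * log(n).
    penalty = 0.5 * lam * 2 * math.log(n)

    # best[j] = best score of any segmentation of x[:j]; prev[j] = start of its last segment.
    best = [-math.inf] * (n + 1)
    prev = [0] * (n + 1)
    best[0] = 0.0
    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if best[i] == -math.inf:
                continue
            score = best[i] + loglik(i, j) - penalty
            if score > best[j]:
                best[j] = score
                prev[j] = i

    # Backtrack from the end of the signal to recover segment boundaries.
    change_points = []
    j = n
    while j > 0:
        i = prev[j]
        if i > 0:
            change_points.append(i)
        j = i
    return sorted(change_points)


# Toy signal: 40 samples near 0 followed by 40 samples near 5.
toy = [0.0, 0.1] * 20 + [5.0, 5.1] * 20
print(segment_bic(toy))
```

Because the penalty is paid once per segment, the search adds a boundary only when the likelihood gain outweighs it, which is how the number of change-points is selected together with their positions. In practice the features would be multi-dimensional (for example cepstral vectors) and `lam` would need tuning on held-out data.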
Bibliographic reference. Zdánský, Jindrich / Nouza, Jan (2005): "Detection of acoustic change-points in audio records via global BIC maximization and dynamic programming", In INTERSPEECH-2005, 669-672.