Voice mining with multiple target speakers

Ran Gazit, Yaakov Metzger

In the basic speaker verification task, an unknown voice segment containing the voice of a single speaker is checked against the acoustic model of a single target speaker. In the multiple-speaker voice mining application, a large set of audio sessions is searched for sessions involving any of several target speakers. Each audio session may hold the voice of more than one speaker. The application must determine which sessions may contain the voice of any of the target speakers.

A multiple-speaker voice mining application, based on a sliding-window speaker detection engine, was designed and tested on speech corpora recorded under real-life conditions in two commercial call centers. The system design, parameters, and test results are presented.
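
To make the sliding-window idea concrete, here is a minimal sketch of how a session might be flagged. It is not the authors' implementation: the abstract does not specify the scoring scheme, so the per-frame scores (imagined here as log-likelihood ratios against each target model), the mean-over-window statistic, the max-over-windows and max-over-targets decision rule, and all names (`window_scores`, `session_score`, `mine`, `win`, `hop`, `threshold`) are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def window_scores(frame_scores: Sequence[float], win: int, hop: int) -> List[float]:
    """Mean frame score inside each sliding window (assumed statistic)."""
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    n = len(frame_scores)
    if n == 0:
        return []
    if n <= win:
        # Session shorter than one window: score it as a single window.
        return [sum(frame_scores) / n]
    return [sum(frame_scores[s:s + win]) / win for s in range(0, n - win + 1, hop)]


def session_score(
    per_target: Dict[str, Sequence[float]], win: int, hop: int
) -> Tuple[Optional[str], float]:
    """Best window score over all targets; returns (target, score)."""
    best_target, best = None, float("-inf")
    for target, scores in per_target.items():
        ws = window_scores(scores, win, hop)
        if ws and max(ws) > best:
            best_target, best = target, max(ws)
    return best_target, best


def mine(
    sessions: Dict[str, Dict[str, Sequence[float]]],
    win: int,
    hop: int,
    threshold: float,
) -> Dict[str, Tuple[str, float]]:
    """Return sessions whose best window score reaches the threshold."""
    hits = {}
    for session_id, per_target in sessions.items():
        target, score = session_score(per_target, win, hop)
        if target is not None and score >= threshold:
            hits[session_id] = (target, score)
    return hits
```

Taking the maximum over windows lets a short stretch of target speech stand out even when other speakers dominate the rest of the session, which is the motivation for windowing in a multi-speaker setting; a real system would also need score normalization and a threshold tuned on held-out data.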

Cite as: Gazit, R., Metzger, Y. (2004) Voice mining with multiple target speakers. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 377-380

