7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Unsupervised Speaker Segmentation of Telephone Conversations

Aaron E. Rosenberg, Allen Gorin, Zhu Liu, S. Parthasarathy

AT&T Labs - Research, USA

A process for segmenting 2-speaker telephone conversations by speaker, with no prior speaker models, is described and evaluated. The process consists of an initial segmentation using acoustic change and pause detection, followed by segment clustering and then iterative modeling of the segment clusters and resegmentation. The technique has been evaluated on 6 customer care conversations, each approximately 3 minutes long. It does not resolve short (< 2 s) or overlapping segments very well, but it can detect longer segments (> 4 s) with miss rates on the order of 10% and confusion rates of 2% or less.
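
The alternation between modeling segment clusters and resegmenting can be illustrated with a minimal sketch. The abstract does not specify the features, the cluster models, or the stopping rule, so everything here is an assumption made for illustration: each segment is reduced to one hypothetical feature vector, each speaker cluster is modeled by a plain centroid (a k-means-style stand-in, not the authors' models), and the loop stops when the labels no longer change. The function name `two_speaker_labels` is likewise hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def two_speaker_labels(segments, max_iters=20):
    """Assign each segment (a feature vector) to speaker 0 or 1.

    Illustrative stand-in only: centroids replace the speaker models,
    which the abstract does not describe.
    """
    if len(segments) < 2:
        raise ValueError("need at least two segments")

    # Initial clustering (assumed): seed with the first segment and
    # the segment farthest from it.
    far = max(range(len(segments)), key=lambda i: dist(segments[0], segments[i]))
    centroids = [list(segments[0]), list(segments[far])]

    labels = None
    for _ in range(max_iters):
        # Resegmentation step: relabel each segment by its nearest cluster model.
        new_labels = [
            0 if dist(s, centroids[0]) <= dist(s, centroids[1]) else 1
            for s in segments
        ]
        if new_labels == labels:
            break  # labels are stable, so the iteration has converged
        labels = new_labels

        # Modeling step: re-estimate each cluster model from its current members.
        for k in (0, 1):
            members = [s for s, lab in zip(segments, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage with made-up 2-D features for five segments.
toy_segments = [(0.0, 0.1), (5.0, 5.2), (0.2, 0.0), (4.9, 5.1), (0.1, 0.2)]
print(two_speaker_labels(toy_segments))  # [0, 1, 0, 1, 0]
```

In this toy input the two groups are well separated, so the labels settle after one relabeling pass. Real conversational speech would need frame-level features and richer cluster models than a centroid.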


Bibliographic reference. Rosenberg, Aaron E. / Gorin, Allen / Liu, Zhu / Parthasarathy, S. (2002): "Unsupervised speaker segmentation of telephone conversations", in ICSLP-2002, 565-568.