INTERSPEECH 2004 - ICSLP
8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Evolutive Speaker Segmentation using a Repository System

Xavier Anguera Miro, Javier Hernando Pericas

Technical University of Catalonia (UPC), Spain

In blind speaker segmentation, one of the main problems is that neither the number of speakers in a conversation nor whether each speaker appears once or several times is known in advance. This paper presents an iterative method based on the Evolutive-HMM and introduces two main improvements to that system. First, a generic repository speaker model is used to model all utterances, and all speaker models are derived from it iteratively; different normalizations of the scores are applied to the repository and to the speakers to emphasize speaker changes. Second, Gaussian Mixture Models (GMMs) are used in all cases because they are more flexible than an HMM structure. The method has been successfully tested on multi-speaker speech sequences generated by concatenating speech segments from Speecon.
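
The iterative idea of deriving speaker models from a repository can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes 1-D features, a single Gaussian per model as a stand-in for a GMM, a repository that stays fixed after being trained on all frames, and a new speaker seeded from the first window still assigned to the repository. It omits the score normalizations mentioned in the abstract, whose details are not given here. The names `fit_gaussian`, `avg_loglik`, `segment` and `toy`, and the parameters `win`, `max_speakers` and `n_passes`, are hypothetical.

```python
import math

def fit_gaussian(frames):
    """Fit a 1-D Gaussian (assumed one-component stand-in for a GMM)."""
    mean = sum(frames) / len(frames)
    var = sum((x - mean) ** 2 for x in frames) / len(frames)
    return mean, max(var, 1e-3)  # variance floor avoids division by zero

def avg_loglik(frames, model):
    """Average per-frame log-likelihood of the frames under a model."""
    mean, var = model
    total = sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
                for x in frames)
    return total / len(frames)

def segment(frames, win=10, max_speakers=5, n_passes=3):
    """Label fixed-length windows; label 0 is the repository."""
    windows = [frames[i:i + win] for i in range(0, len(frames), win)]
    models = [fit_gaussian(frames)]  # index 0: repository over all frames
    labels = [0] * len(windows)
    # Derive one new speaker per iteration while the repository holds data.
    while 0 in labels and len(models) <= max_speakers:
        seed = labels.index(0)  # assumption: seed from first repository window
        models.append(fit_gaussian(windows[seed]))
        for _ in range(n_passes):
            # Assign every window to its best-scoring model.
            labels = [max(range(len(models)),
                          key=lambda k: avg_loglik(w, models[k]))
                      for w in windows]
            # Retrain speaker models only; the repository stays fixed.
            for k in range(1, len(models)):
                own = [x for w, l in zip(windows, labels) if l == k for x in w]
                if own:
                    models[k] = fit_gaussian(own)
    return labels

def toy(mean, n, offset):
    """Deterministic toy 'speaker': a mean plus a small sinusoidal wobble."""
    return [mean + 0.5 * math.sin(1.7 * (i + offset)) for i in range(n)]

# Speaker A, then speaker B, then speaker A again.
frames = toy(0.0, 30, 0) + toy(5.0, 30, 30) + toy(0.0, 30, 60)
labels = segment(frames)
print(labels)
```

On this toy sequence the first speaker reappears at the end, so the windows of the first and last thirds should share one label while the middle third receives another. A real system would use multi-dimensional acoustic features and multi-component GMMs.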


Bibliographic reference. Miro, Xavier Anguera / Pericas, Javier Hernando (2004): "Evolutive speaker segmentation using a repository system", in INTERSPEECH-2004, 605-608.