ISCA Archive Interspeech 2008

Clustering initialization based on spatial information for speaker diarization of meetings

J. Luque, Carlos Segura, Javier Hernando

This paper proposes an initialization for an agglomerative clustering system applied to speaker diarization in the meeting environment. The initialization is based on a prior clustering of the temporal sequence of Time Delay of Arrival (TDOA) estimates between pairs of sensors. The purpose of that initial clustering is to obtain initial classes that each contain speech from a single speaker. The aim is to ensure the purity of the initial segments by exploiting the positions of the speakers over the course of a meeting. The TDOA initialization was tested on the dataset used in the RT07s evaluation, where it improves the diarization error rate with respect to the classical uniform initialization. Most of the experiments show that the purity of the initial segments leads to a better clustering in the subsequent hierarchical strategy based on cepstral features.
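The idea of seeding the diarization clusters from TDOA estimates can be illustrated with a minimal sketch. The abstract does not state which delay estimator or which clustering algorithm the authors use, so the choices here are assumptions made only for illustration: GCC-PHAT for the per-frame delay of one microphone pair, and a plain one-dimensional k-means over the resulting delay sequence. The function names `gcc_phat_tdoa` and `cluster_tdoa` are hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(sig, ref, fs, max_delay_s):
    """Estimate the delay (seconds) of `sig` relative to `ref` with GCC-PHAT.

    Assumption: GCC-PHAT stands in for the unspecified TDOA estimator.
    A positive result means `sig` lags `ref`. `max_delay_s` must
    correspond to at least one sample.
    """
    n = len(sig) + len(ref)  # zero-pad to avoid circular wrap-around
    spec_sig = np.fft.rfft(sig, n=n)
    spec_ref = np.fft.rfft(ref, n=n)
    cross = spec_sig * np.conj(spec_ref)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = min(int(fs * max_delay_s), n // 2)
    # Reorder so that index 0 corresponds to a lag of -max_shift samples.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


def cluster_tdoa(tdoas, n_clusters, n_iter=20):
    """Group per-frame TDOA values with 1-D k-means (an assumed choice).

    Returns one label per frame and the cluster centers. Frames sharing
    a label are candidates for one initial, hopefully single-speaker, class.
    """
    tdoas = np.asarray(tdoas, dtype=float)
    # Deterministic start: centers placed at evenly spaced quantiles.
    centers = np.quantile(tdoas, (np.arange(n_clusters) + 0.5) / n_clusters)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(tdoas[:, None] - centers[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = tdoas[labels == k].mean()
    labels = np.argmin(np.abs(tdoas[:, None] - centers[None, :]), axis=1)
    return labels, centers
```

In such a sketch, the labels returned by `cluster_tdoa` would replace the uniform split of the recording as the starting classes, after which the agglomerative stage on cepstral features proceeds as usual. A real meeting setup has several microphone pairs, so the per-frame feature would be a vector of delays rather than a single value.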


doi: 10.21437/Interspeech.2008-154

Cite as: Luque, J., Segura, C., Hernando, J. (2008) Clustering initialization based on spatial information for speaker diarization of meetings. Proc. Interspeech 2008, 383-386, doi: 10.21437/Interspeech.2008-154

@inproceedings{luque08_interspeech,
  author={J. Luque and Carlos Segura and Javier Hernando},
  title={{Clustering initialization based on spatial information for speaker diarization of meetings}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={383--386},
  doi={10.21437/Interspeech.2008-154}
}