ISCA Archive Odyssey 2014

Hierarchical speaker clustering methods for the NIST i-vector Challenge

Marc Ferras, Elie Khoury, Sébastien Marcel, Laurent El Shafey

Manually labeling data is very expensive and sometimes infeasible because of privacy and security issues. This paper investigates two algorithms for clustering unlabeled training i-vectors, with the aim of improving speaker recognition performance by making state-of-the-art supervised techniques usable in the context of the NIST i-vector Machine Learning Challenge 2014. The first algorithm is the well-known Ward clustering, which optimizes an objective function across all clusters. The second is a cascade clustering that benefits from the latest advances in speaker modeling and session compensation techniques, and relies on both cosine similarity and probabilistic linear discriminant analysis (PLDA). The paper also investigates multi-clustering fusion, which opens the door to further improvements. The experimental results show that using the automatically labeled i-vectors to train supervised methods such as LDA, PLDA, or linear logistic regression-based fusion decreases the minimum decision cost function by up to 22%.
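As a rough illustration of the Ward criterion mentioned in the abstract, the following minimal Python sketch clusters a handful of synthetic vectors by repeatedly merging the pair of clusters whose merge increases the within-cluster sum of squares the least. It is not the authors' implementation: the toy data, the dimensionality, the length normalization step, and all function names (length_normalize, ward_cost, ward_cluster, make_toy_ivectors) are assumptions made for this example. For unit-length vectors the squared Euclidean distance equals 2 - 2 * (cosine similarity), so the sketch is loosely related to cosine scoring, but the cascade clustering and PLDA stages described in the abstract are not covered.

```python
import math
import random


def length_normalize(v):
    """Scale a vector to unit Euclidean norm (an assumption of this sketch)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = a["n"], b["n"]
    dist2 = sum((x - y) ** 2 for x, y in zip(a["mean"], b["mean"]))
    return na * nb / (na + nb) * dist2


def ward_cluster(vectors, num_clusters):
    """Naive agglomerative clustering with Ward's merge criterion."""
    clusters = [{"n": 1, "mean": list(v), "members": [i]}
                for i, v in enumerate(vectors)]
    while len(clusters) > num_clusters:
        # Pick the pair whose merge increases the objective the least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: ward_cost(clusters[p[0]], clusters[p[1]]),
        )
        a, b = clusters[i], clusters[j]
        n = a["n"] + b["n"]
        merged = {
            "n": n,
            # Size-weighted mean of the two merged clusters.
            "mean": [(a["n"] * x + b["n"] * y) / n
                     for x, y in zip(a["mean"], b["mean"])],
            "members": a["members"] + b["members"],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    # Convert the cluster membership lists into one label per input vector.
    labels = [0] * len(vectors)
    for label, c in enumerate(clusters):
        for m in c["members"]:
            labels[m] = label
    return labels


def make_toy_ivectors(seed=0):
    """Synthetic stand-ins for i-vectors: 3 hypothetical speakers, 5 each."""
    rng = random.Random(seed)
    centers = [[5.0, 0.0, 0.0, 0.0],
               [0.0, 5.0, 0.0, 0.0],
               [0.0, 0.0, 5.0, 0.0]]
    vectors, truth = [], []
    for spk, center in enumerate(centers):
        for _ in range(5):
            noisy = [x + rng.gauss(0, 0.3) for x in center]
            vectors.append(length_normalize(noisy))
            truth.append(spk)
    return vectors, truth


vectors, truth = make_toy_ivectors()
labels = ward_cluster(vectors, num_clusters=3)
```

The resulting labels could then stand in for speaker identities when training a supervised back end. The pairwise search here is cubic in the number of vectors, so for anything beyond toy sizes an optimized library routine would be the practical choice.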


doi: 10.21437/Odyssey.2014-38

Cite as: Ferras, M., Khoury, E., Marcel, S., Shafey, L.E. (2014) Hierarchical speaker clustering methods for the NIST i-vector Challenge. Proc. The Speaker and Language Recognition Workshop (Odyssey 2014), 254-259, doi: 10.21437/Odyssey.2014-38

@inproceedings{ferras14_odyssey,
  author={Marc Ferras and Elie Khoury and Sébastien Marcel and Laurent El Shafey},
  title={{Hierarchical speaker clustering methods for the NIST i-vector Challenge}},
  year=2014,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2014)},
  pages={254--259},
  doi={10.21437/Odyssey.2014-38}
}