INTERSPEECH 2013

This paper presents a new speaker verification system that uses i-vector modeling as a feature extractor. Within this framework, we explore distance constraints between i-vector pairs from the same speaker and from different speakers. We approximate the distance metric as a weighted covariance matrix of the top eigenvectors of the data covariance matrix, and use variational inference to estimate a posterior distribution over the metric. Given speaker labels, we select the different-speaker data pairs with the highest cosine scores to form a different-speaker constraint set. This set captures the most discriminative between-speaker variability in the training data. This Bayesian distance metric learning approach achieves better performance than the state-of-the-art method. Furthermore, compared with cosine scoring, it is insensitive to score normalization. With no requirement on the number of labeled examples, the approach performs very well when training data is limited.
Bibliographic reference. Fang, Xiao / Dehak, Najim / Glass, James (2013): "Bayesian distance metric learning on i-vector for speaker verification", In INTERSPEECH-2013, 2514-2518.
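
Illustrative sketch. The abstract describes selecting the different-speaker pairs with the highest cosine scores to form a constraint set. The snippet below is a minimal sketch of that selection step only, not the authors' implementation. The function names (`cosine_score`, `hardest_different_speaker_pairs`), the `top_k` parameter, the exhaustive pairwise search, and the two-dimensional toy vectors are all assumptions made for illustration; real i-vectors have several hundred dimensions.

```python
import math
from itertools import combinations


def cosine_score(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_different_speaker_pairs(ivectors, labels, top_k):
    """Return index pairs (i, j) of the top_k different-speaker pairs,
    ordered from highest to lowest cosine score.

    Hypothetical helper: scores every pair exhaustively, which is
    quadratic in the number of vectors.
    """
    scored = []
    for i, j in combinations(range(len(ivectors)), 2):
        if labels[i] != labels[j]:  # keep different-speaker pairs only
            scored.append((cosine_score(ivectors[i], ivectors[j]), i, j))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(i, j) for _, i, j in scored[:top_k]]


# Toy data (assumed for illustration): two vectors per speaker.
ivectors = [
    [1.0, 0.0],  # speaker A
    [0.9, 0.1],  # speaker A
    [0.8, 0.6],  # speaker B
    [0.0, 1.0],  # speaker B
]
labels = ["A", "A", "B", "B"]

pairs = hardest_different_speaker_pairs(ivectors, labels, top_k=2)
print(pairs)
```

The pairs returned first are the different-speaker pairs that cosine scoring finds hardest to separate, which is why such a set would be expected to carry the most discriminative between-speaker information.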