This paper presents a new speaker verification system that uses i-vector modeling as a feature extractor. Within this framework, we explore distance constraints between i-vector pairs from the same speaker and from different speakers. The distance metric is approximated as a weighted covariance matrix of the top eigenvectors of the data covariance matrix, and variational inference is used to estimate a posterior distribution over the distance metric. Given speaker labels, we select the different-speaker data pairs with the highest cosine scores to form a different-speaker constraint set. This set captures the most discriminative between-speaker variability in the training data. This Bayesian distance metric learning approach achieves better performance than the state-of-the-art method. Furthermore, unlike cosine scoring, this approach is insensitive to score normalization. Because it does not depend on a large number of labeled examples, the approach performs very well when training data is limited.
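The selection of the different-speaker constraint set can be illustrated with a minimal sketch. It assumes i-vectors are given as plain lists of floats and that all pairs are scored exhaustively. The function names `cosine_score` and `top_different_speaker_pairs` and the parameter `num_pairs` are hypothetical and do not come from the paper, which may select pairs differently in practice.

```python
import math
from itertools import combinations


def cosine_score(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_different_speaker_pairs(ivectors, labels, num_pairs):
    """Return up to num_pairs (i, j, score) tuples, highest cosine score first.

    Only pairs whose speaker labels differ are considered, so the result is
    the set of different-speaker pairs that cosine scoring confuses most.
    num_pairs is an illustrative parameter, not a value from the paper.
    """
    scored = []
    for i, j in combinations(range(len(ivectors)), 2):
        if labels[i] != labels[j]:
            scored.append((cosine_score(ivectors[i], ivectors[j]), i, j))
    # Sort by descending cosine score and keep the hardest pairs.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(i, j, score) for score, i, j in scored[:num_pairs]]


# Toy data: vectors 0 and 1 share a speaker, so that pair is never selected.
toy_ivectors = [[1.0, 0.0], [1.0, 0.01], [1.0, 0.05], [0.0, 1.0]]
toy_labels = ["a", "a", "b", "c"]
hard_pairs = top_different_speaker_pairs(toy_ivectors, toy_labels, num_pairs=2)
```

Scoring every pair costs time quadratic in the number of i-vectors, which is acceptable for a toy illustration but would likely need batching or pruning on a full training set.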
Bibliographic reference. Fang, Xiao / Dehak, Najim / Glass, James (2013): "Bayesian distance metric learning on i-vector for speaker verification", In INTERSPEECH-2013, 2514-2518.