INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

PLDA Modeling in I-Vector and Supervector Space for Speaker Verification

Ye Jiang (1,2), Kong Aik Lee (2), Zhenmin Tang (1), Bin Ma (2), Anthony Larcher (2), Haizhou Li (2,3)

(1) School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, China
(2) Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
(3) School of EE&T, University of New South Wales, Australia

In this paper, we advocate the use of the uncompressed form of the i-vector. We employ probabilistic linear discriminant analysis (PLDA) to handle speaker and session variability in the speaker verification task. An i-vector is a low-dimensional vector containing both speaker and channel information acquired from a speech segment. When PLDA is applied to i-vectors, dimension reduction is performed twice: first in the i-vector extraction process and second in the PLDA model. Keeping the full dimensionality of the i-vector in the supervector space for PLDA modeling and scoring avoids unnecessary loss of information. The drawback of using PLDA on uncompressed i-vectors is the inversion of large matrices, which we show can be solved rather efficiently by partitioning the large matrix into smaller blocks. We also introduce the Gaussianized rank-norm, as an alternative to whitening, for feature normalization prior to PLDA modeling.
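The abstract does not spell out how the Gaussianized rank-norm is computed. As a rough illustration only, the sketch below assumes one plausible reading: each element of a vector is replaced by its normalized rank within a background set, and that rank is then mapped through the inverse standard-normal CDF. The function names (`fit_rank_norm`, `gaussianized_rank_norm`), the mid-rank tie handling, and the rank offsets are assumptions made for this example, not details taken from the paper.

```python
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def fit_rank_norm(background):
    """Sort each dimension of the background vectors once, for rank lookup.

    `background` is a list of equal-length vectors (hypothetical training set).
    """
    dims = len(background[0])
    return [sorted(vec[d] for vec in background) for d in range(dims)]


def gaussianized_rank_norm(vector, sorted_columns):
    """Map each element to the standard-normal quantile of its background rank.

    Assumed formulation for illustration; the paper may define it differently.
    """
    std_normal = NormalDist()
    out = []
    for value, column in zip(vector, sorted_columns):
        n = len(column)
        # Mid-rank: background values strictly below, plus half of any ties.
        below = bisect_left(column, value)
        ties = bisect_right(column, value) - below
        # The +0.5 and +1 offsets keep the rank strictly inside (0, 1),
        # so the inverse CDF always returns a finite value.
        rank = (below + 0.5 * ties + 0.5) / (n + 1)
        out.append(std_normal.inv_cdf(rank))
    return out
```

Under these assumptions, `fit_rank_norm` would be run once on a background set, and `gaussianized_rank_norm` applied to every training and test vector before PLDA modeling. Unlike whitening, this mapping works on each dimension independently and depends only on rank order, so it is insensitive to outliers in the raw values.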

Index Terms: speaker verification, i-vector, probabilistic LDA

Bibliographic reference.  Jiang, Ye / Lee, Kong Aik / Tang, Zhenmin / Ma, Bin / Larcher, Anthony / Li, Haizhou (2012): "PLDA modeling in i-vector and supervector space for speaker verification", In INTERSPEECH-2012, 1680-1683.