13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification

Sheng Chen, Mingxing Xu

Key Laboratory of Pervasive Computing, Ministry of Education, Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Computer Science and Technology, Tsinghua University, Beijing, China

The performance of speaker verification systems is adversely affected by intrinsic variability in real-world applications. In this paper, two factor analysis approaches, Joint Factor Analysis (JFA) and i-vector modeling, are used to address the effects of intrinsic variation for robust speaker verification. In the JFA approach, speaker variability and intrinsic variability are modeled with the speaker and session factors, respectively. In the i-vector framework, a low-dimensional space is defined to model the total variability, and intrinsic variations are compensated with a variety of techniques including Linear Discriminant Analysis (LDA), Within-Class Covariance Normalization (WCCN) and Nuisance Attribute Projection (NAP). Experiments on the intrinsic variation corpus show that both JFA and the i-vector framework perform much better than the GMM-UBM paradigm in modeling intrinsic variability. Compared to the GMM-UBM baseline system, relative reductions in Equal Error Rate (EER) of around 39.85% and 36.76% are obtained for the JFA and i-vector+LDA+WCCN speaker verification systems, respectively.
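
As an illustration of one compensation step named in the abstract, the following is a minimal sketch of Within-Class Covariance Normalization applied to i-vector-like features, followed by cosine scoring. It is not the authors' implementation: the function names, the synthetic data, and the small ridge term added for numerical stability are all assumptions made for this example, and real i-vectors would come from a trained total variability model.

```python
import numpy as np

def within_class_covariance(vectors, labels):
    """Average of the per-speaker covariance matrices."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    dim = vectors.shape[1]
    W = np.zeros((dim, dim))
    for c in classes:
        x = vectors[labels == c]
        centered = x - x.mean(axis=0)
        W += centered.T @ centered / len(x)
    return W / len(classes)

def wccn_projection(vectors, labels, ridge=1e-6):
    """Return B with B @ B.T equal to the inverse within-class covariance.

    The ridge term is an assumption added here to keep W invertible.
    """
    W = within_class_covariance(vectors, labels)
    W += ridge * np.eye(W.shape[0])
    return np.linalg.cholesky(np.linalg.inv(W))

def cosine_score(a, b):
    """Cosine similarity between two projected vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-in for i-vectors: 5 speakers, 20 sessions each, 8 dimensions.
rng = np.random.default_rng(0)
speaker_means = rng.normal(size=(5, 8)) * 3.0
labels = np.repeat(np.arange(5), 20)
session_scale = np.linspace(0.2, 2.0, 8)  # uneven within-speaker spread
vectors = speaker_means[labels] + rng.normal(size=(100, 8)) * session_scale

B = wccn_projection(vectors, labels)
projected = vectors @ B  # each row is (B.T @ w) for one vector w

# Score a same-speaker pair and a different-speaker pair after WCCN.
same = cosine_score(projected[0], projected[1])
diff = cosine_score(projected[0], projected[20])
```

After the projection, the within-speaker covariance of the transformed vectors is close to the identity, so directions with large session-to-session spread no longer dominate the cosine score.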

Index Terms: speaker verification, intrinsic variability, joint factor analysis, i-vector, LDA, WCCN, NAP


Bibliographic reference. Chen, Sheng / Xu, Mingxing (2012): "Compensation of intrinsic variability with factor analysis modeling for robust speaker verification", In INTERSPEECH-2012, 1576-1579.