Odyssey 2012 - The Speaker and Language Recognition Workshop

Singapore
June 25-28, 2012

Factor Analysis of Mixture of Auto-Associative Neural Networks for Speaker Verification

Sri Garimella, Hynek Hermansky

Center for Language and Speech Processing, Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, USA

This paper introduces the theory of factor analysis of a mixture of Auto-Associative Neural Networks (AANNs), with application to speaker verification. First, we formulate the problem of learning a low-dimensional subspace in part of the parameter space of the mixture of AANNs, and then derive the update equations by minimizing the loss function of the mixture. Second, we apply this technique to build a neural-network-based speaker verification system in which the low-dimensional subspace is trained to capture both speaker and channel variabilities. This low-dimensional (or i-vector) representation is used as the feature input to a probabilistic linear discriminant analysis (PLDA) model, as in state-of-the-art speaker verification systems. The proposed factor analysis approach shows promising results on the NIST-08 speaker recognition evaluation (SRE), and yields an 18% relative improvement in minimum detection cost function (minDCF) over the previously proposed subspace-based mixture of AANNs system.
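
As a rough illustration of the subspace idea summarized above, the adapted part of the mixture's parameters for an utterance s can be sketched in generic i-vector-style notation. This is only a sketch under stated assumptions, not the paper's exact formulation: the symbols \theta_s, \theta_0, T, w_s and E_s are introduced here for illustration and do not appear in the abstract.

```latex
% Assumed notation (illustrative only):
%   \theta_0 : utterance-independent values of the adapted parameters
%   T        : low-rank matrix spanning the learned subspace
%   w_s      : low-dimensional (i-vector) representation of utterance s
%   E_s      : loss of the mixture of AANNs evaluated on utterance s
\theta_s = \theta_0 + T\,w_s ,
\qquad
\hat{w}_s = \arg\min_{w} \; E_s(\theta_0 + T\,w)

% Usual convention for a relative improvement in minDCF:
\text{relative improvement} =
  \frac{\mathrm{minDCF}_{\text{baseline}} - \mathrm{minDCF}_{\text{proposed}}}
       {\mathrm{minDCF}_{\text{baseline}}}
```

Under this convention, the reported 18% figure would correspond to the ratio in the last expression being 0.18, with the earlier subspace-based mixture of AANNs system as the baseline.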


Bibliographic reference. Garimella, Sri / Hermansky, Hynek (2012): "Factor analysis of mixture of auto-associative neural networks for speaker verification", in Odyssey-2012, 92-97.