State-of-the-art factor-analysis-based channel compensation methods for speaker recognition assume that speaker/utterance-dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower-dimensional subspace. This assumption does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization and enhancement at the same time. The transform applies a square-root Wiener gain along the acoustic feature eigenvector directions, and is similar to signal-subspace-based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transform is applied using a probabilistic mixture alignment, and is integrated with a conventional i-Vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.
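The abstract does not give the transform's equations, so the following is only a minimal sketch of how a transform with the four stated properties could be built for a single mixture. It assumes a probabilistic-PCA-style factor model with isotropic residual noise, where the noise variance is estimated as the mean of the discarded eigenvalues. The function name `afa_style_transform` and the arguments `cov` and `q` are hypothetical and do not come from the paper.

```python
import numpy as np

def afa_style_transform(cov, q):
    """Sketch of a q x d feature transform for one mixture (assumed PPCA-style model).

    cov : d x d covariance of the acoustic features aligned to this mixture
    q   : number of retained factors, 0 < q < d (assumed to be chosen by the user)
    """
    # Eigen-decomposition of the symmetric covariance; eigh returns ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

    # Assumption: isotropic noise variance = mean of the discarded eigenvalues.
    noise_var = eigvals[q:].mean()
    lam = eigvals[:q]

    # Wiener-type gain per retained eigen-direction (signal power / total power).
    wiener_gain = (lam - noise_var) / lam

    # Project (reduction + de-correlation), scale by 1/sqrt(lam) (variance
    # normalization), then by sqrt(gain) (enhancement).
    scale = np.sqrt(wiener_gain) / np.sqrt(lam)
    return scale[:, None] * eigvecs[:, :q].T
```

Under these assumptions, applying the returned matrix to mean-removed features yields `q`-dimensional outputs whose covariance is diagonal, with each diagonal entry equal to the Wiener gain of the corresponding direction. A mixture-dependent version would build one such matrix per GMM component and combine the outputs using the component posteriors, which this sketch leaves out.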
Bibliographic reference. Hasan, Taufiq / Hansen, John H. L. (2012): "Integrated feature normalization and enhancement for robust speaker recognition using acoustic factor analysis", In INTERSPEECH-2012, 1568-1571.