Robustness to mismatched train/test conditions is one of the biggest challenges facing speaker recognition today, with transmission channel/handset variability and additive noise distortion being the most prominent factors. One limitation of recent speaker recognition systems is that they rely on latent factor analysis modeling of the GMM mean super-vectors alone. Motivated by the covariance structure of cepstral features, in this study we develop a factor analysis model in the acoustic feature space instead of the super-vector domain. The proposed technique computes a mixture-dependent feature dimensionality reduction transform, which is applied directly to the first-order Baum-Welch statistics for effective integration with a conventional i-vector-PLDA system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the superiority of the proposed scheme.
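
As a rough illustration of the idea of applying a mixture-dependent transform directly to first-order Baum-Welch statistics, the sketch below uses the standard closed-form probabilistic PCA solution (Tipping and Bishop) for each mixture component. It is an assumption-laden sketch, not the paper's exact formulation: the function names `ppca_transform` and `transformed_first_order_stats`, the choice of the PPCA posterior-mean projection as the transform, and the use of centered statistics are all hypothetical choices made here for illustration.

```python
import numpy as np

def ppca_transform(cov, q):
    """Closed-form PPCA projection (q x D) for one mixture's covariance.

    Assumption: the transform is the PPCA posterior-mean mapping
    M^{-1} W^T, with M = W^T W + sigma^2 I. Requires q < D.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                   # noise variance: mean of discarded eigenvalues
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    M = W.T @ W + sigma2 * np.eye(q)
    return np.linalg.solve(M, W.T)                # equals M^{-1} W^T

def transformed_first_order_stats(X, post, means, transforms):
    """Apply each mixture's transform to its centered first-order statistic.

    X: (T, D) features, post: (T, C) frame posteriors,
    means: (C, D) mixture means, transforms: list of C arrays of shape (q, D).
    Returns an array of shape (C, q).
    """
    stats = []
    for c, A in enumerate(transforms):
        F_c = post[:, c] @ (X - means[c])         # sum_t gamma_t(c) (x_t - mu_c), shape (D,)
        stats.append(A @ F_c)                     # reduced-dimension statistic, shape (q,)
    return np.stack(stats)
```

Because the transform is linear, applying it once to the accumulated statistic gives the same result as transforming every frame and then accumulating, which is why such a transform can be slotted in ahead of a conventional i-vector extractor without touching the frame-level pipeline.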
Cite as: Hasan, T., Hansen, J.H.L. (2012) Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification. Proc. The Speaker and Language Recognition Workshop (Odyssey 2012), 243-247
@inproceedings{hasan12_odyssey,
  author={Taufiq Hasan and John H. L. Hansen},
  title={{Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification}},
  year=2012,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2012)},
  pages={243--247}
}