ISCA Archive Interspeech 2013

Acoustic factor analysis based universal background model for robust speaker verification in noise

Taufiq Hasan, John H. L. Hansen

The Universal Background Model (UBM) is a speaker-independent Gaussian Mixture Model (GMM) trained on a large speech corpus containing recordings of many speakers in various conditions. When noisy test data is involved, a UBM trained on clean data is generally not optimal. Using noisy data for UBM training, however, creates a bias towards the specific development noise samples, resulting in degraded speaker recognition performance on unseen noise types. In this study, we utilize an Acoustic Factor Analysis (AFA) based UBM that iteratively learns the dominant feature sub-spaces in each mixture component, resulting in a more robust model. We explore two variants of the model: one with isotropic and the other with diagonal residual noise. The Maximum-Likelihood (ML) training formulations of the models are provided. The latent variables of the model, termed acoustic factors, are used as features to train the second stage of factor analysis parameters, i.e., the traditional i-vector extractor. Experiments performed on the 2012 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) indicate the effectiveness of the proposed strategy in both clean and noisy conditions.
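
For orientation, the two model variants named in the abstract can be sketched in the standard notation for mixtures of probabilistic PCA and mixtures of factor analyzers. The symbols below (x, y, W_g, mu_g, sigma_g, Psi_g, d, q) are assumptions chosen for illustration and are not taken from the paper, whose notation and exact formulation may differ.

```latex
% Assumed notation: x is a d-dimensional acoustic feature vector,
% g indexes a mixture component, and y is the q-dimensional vector
% of acoustic factors (q < d).
\begin{align}
  x &= \mu_g + W_g\, y + \epsilon_g, \qquad y \sim \mathcal{N}(0, I_q), \\
  \epsilon_g &\sim \mathcal{N}(0, \sigma_g^2 I_d)
      && \text{(isotropic residual noise)}, \\
  \epsilon_g &\sim \mathcal{N}(0, \Psi_g), \quad \Psi_g \text{ diagonal}
      && \text{(diagonal residual noise)}.
\end{align}
% Marginalizing over y gives the component density, shown here for
% the isotropic case:
\begin{equation}
  p(x \mid g) = \mathcal{N}\!\left(x;\ \mu_g,\ W_g W_g^{\top} + \sigma_g^2 I_d\right).
\end{equation}
% Posterior mean of the acoustic factors for each variant:
\begin{align}
  \mathbb{E}[y \mid x, g] &= \left(\sigma_g^2 I_q + W_g^{\top} W_g\right)^{-1} W_g^{\top} (x - \mu_g)
      && \text{(isotropic)}, \\
  \mathbb{E}[y \mid x, g] &= \left(I_q + W_g^{\top} \Psi_g^{-1} W_g\right)^{-1} W_g^{\top} \Psi_g^{-1} (x - \mu_g)
      && \text{(diagonal)}.
\end{align}
```

Under these assumptions, the posterior means are one plausible reading of the "acoustic factors" that the abstract describes as features for the i-vector extractor; the paper itself should be consulted for the definition actually used.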


doi: 10.21437/Interspeech.2013-681

Cite as: Hasan, T., Hansen, J.H.L. (2013) Acoustic factor analysis based universal background model for robust speaker verification in noise. Proc. Interspeech 2013, 3127-3131, doi: 10.21437/Interspeech.2013-681

@inproceedings{hasan13_interspeech,
  author={Taufiq Hasan and John H. L. Hansen},
  title={{Acoustic factor analysis based universal background model for robust speaker verification in noise}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3127--3131},
  doi={10.21437/Interspeech.2013-681}
}