Factor Analysis Based Speaker Verification Using ASR

Hang Su, Steven Wegmann


In this paper, we propose to improve speaker verification performance by importing better posterior statistics from acoustic models trained for Automatic Speech Recognition (ASR). This approach aims to bring state-of-the-art ASR techniques to the speaker verification task. We compare statistics collected from several ASR systems and show that those collected from deep neural networks (DNNs) trained with fMLLR features reduce the equal error rate (EER) by more than 30% on the NIST SRE 2010 task, compared with those from DNNs trained without feature transformations. We also present a derivation of factor analysis using variational Bayes inference, and illustrate implementation details of factor analysis and probabilistic linear discriminant analysis (PLDA) in the Kaldi recognition toolkit.
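
As a rough illustration of what "posterior statistics" can mean in a factor-analysis front end, the sketch below accumulates the standard zeroth- and first-order statistics of one utterance from per-frame class posteriors. It is not taken from the paper or from Kaldi: the function name `collect_stats`, the toy inputs, and the assumption that posteriors arrive as plain per-frame lists (from a GMM or from DNN outputs) are all hypothetical.

```python
def collect_stats(posteriors, features):
    """Accumulate zeroth- and first-order statistics for one utterance.

    posteriors: list of T frames, each a list of C class posteriors
                (assumed to come from some acoustic model, e.g. a DNN).
    features:   list of T frames, each a list of D feature values.
    Returns (zeroth, first), where zeroth[c] = sum_t gamma_t(c) and
    first[c][d] = sum_t gamma_t(c) * x_t[d].
    """
    if len(posteriors) != len(features):
        raise ValueError("posteriors and features must have the same number of frames")
    if not posteriors:
        raise ValueError("at least one frame is required")

    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes
    first = [[0.0] * dim for _ in range(num_classes)]

    for gamma, x in zip(posteriors, features):
        for c, g in enumerate(gamma):
            zeroth[c] += g                  # soft frame count for class c
            for d, value in enumerate(x):
                first[c][d] += g * value    # posterior-weighted feature sum
    return zeroth, first


# Toy data (hypothetical): 3 frames, 2 classes, 2-dimensional features.
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
features = [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]
zeroth, first = collect_stats(posteriors, features)
```

Under this view, swapping in a different acoustic model changes only the `posteriors` input; the accumulation itself is unchanged.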


DOI: 10.21437/Interspeech.2016-1157

Cite as

Su, H., Wegmann, S. (2016) Factor Analysis Based Speaker Verification Using ASR. Proc. Interspeech 2016, 2223-2227.

BibTeX
@inproceedings{Su+2016,
author={Hang Su and Steven Wegmann},
title={Factor Analysis Based Speaker Verification Using ASR},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1157},
url={http://dx.doi.org/10.21437/Interspeech.2016-1157},
pages={2223--2227}
}