We report the results of speaker verification experiments on the NIST 1999 and NIST 2000 test sets using factor analysis likelihood ratio statistics. For the experiments on the 1999 test set, we had to train the factor analysis model on a mismatched training set, namely Phases 1 and 2 of the Switchboard II corpus. Our results on this test set are comparable to (but not better than) the best results that have been attained with standard methods (GMM likelihood ratios and handset detection). In order to experiment with well-matched training and test sets, we used half of the target speakers in the NIST 2000 evaluation for testing and a disjoint set of speakers taken from Switchboard II, Phases 1 and 2, for training. In this situation we obtained an equal error rate of 7.2% and a minimum detection cost of 0.028. These figures represent an improvement of about 25% over standard methods.
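For readers unfamiliar with the two figures of merit quoted above, the following is a minimal sketch of how an equal error rate and a minimum detection cost can be computed from a list of trial scores. It is not the authors' evaluation code. The function names, the toy scores, and the cost parameters (`c_miss=10`, `c_fa=1`, `p_target=0.01`, the values conventionally used in NIST evaluations of that period) are assumptions for illustration and are not stated in the abstract.

```python
def error_rates(target_scores, impostor_scores):
    """Return (threshold, p_miss, p_fa) for every candidate threshold.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    thresholds.append(float("inf"))  # operating point that rejects every trial
    points = []
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: an impostor trial scored at or above the threshold.
        p_fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        points.append((t, p_miss, p_fa))
    return points


def equal_error_rate(points):
    """Approximate the EER at the point where p_miss and p_fa are closest."""
    _, p_miss, p_fa = min(points, key=lambda p: abs(p[1] - p[2]))
    return (p_miss + p_fa) / 2


def min_detection_cost(points, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum over thresholds of the detection cost function.

    The default cost parameters are assumed, not taken from the abstract.
    """
    return min(
        c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
        for _, p_miss, p_fa in points
    )


# Hypothetical toy scores, purely illustrative.
target_scores = [2.0, 1.5, 0.3, 1.1]
impostor_scores = [-1.0, 0.5, -0.2, 0.1]

points = error_rates(target_scores, impostor_scores)
print("EER:", equal_error_rate(points))
print("min DCF:", min_detection_cost(points))
```

With real evaluation data the two score lists would hold the likelihood ratio statistics for target and impostor trials. The sweep over thresholds is quadratic in the number of trials, which is acceptable for a sketch but would normally be replaced by a sorted cumulative count.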
Cite as: Kenny, P., Dumouchel, P. (2004) Experiments in speaker verification using factor analysis likelihood ratios. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 219-226
@inproceedings{kenny04_odyssey, author={Patrick Kenny and Pierre Dumouchel}, title={{Experiments in speaker verification using factor analysis likelihood ratios}}, year=2004, booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2004)}, pages={219--226} }