15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

SNR-Dependent Mixture of PLDA for Noise Robust Speaker Verification

Man-Wai Mak

Hong Kong Polytechnic University, China

This paper proposes a mixture of SNR-dependent PLDA models that provides wider coverage of the i-vector space, so that the resulting i-vector/PLDA system can handle test utterances with a wide range of SNR. To maximise the coordination among the PLDA models, they are trained simultaneously via an EM algorithm using utterances contaminated with noise at various levels. The contribution of a training i-vector to each PLDA model is determined by the posterior probability of the utterance's SNR. Given a test i-vector, the marginal likelihoods from the individual PLDA models are linearly combined based on the SNR posterior probabilities of the test utterance and the target speaker's utterance. Verification scores are the ratios of the combined marginal likelihoods. Results based on NIST 2012 SRE suggest that this soft-decision scheme is particularly suitable for situations where the test utterances exhibit a wide range of SNR.
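The soft-decision scoring described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the SNR posteriors are assumed to come from a one-dimensional Gaussian mixture over utterance SNR, the combination is assumed to run over every pair of (target, test) SNR groups, and the per-model PLDA marginal likelihoods are taken as precomputed inputs. The names `snr_posteriors` and `mixture_llr` and all parameter values are hypothetical.

```python
import math


def snr_posteriors(snr_db, weights, means, stds):
    """Posterior probability of each SNR group given an utterance's SNR.

    Assumption: SNR is modelled by a 1-D Gaussian mixture with the given
    weights, means and standard deviations (one component per PLDA model).
    """
    # The constant -0.5*log(2*pi) is shared by all components and cancels.
    log_terms = [
        math.log(w) - math.log(s) - 0.5 * ((snr_db - m) / s) ** 2
        for w, m, s in zip(weights, means, stds)
    ]
    peak = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - peak) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mixture_llr(post_target, post_test, joint_lik, target_lik, test_lik):
    """Log-likelihood-ratio score from linearly combined marginal likelihoods.

    joint_lik[i][j]: same-speaker marginal likelihood of the i-vector pair
        when the target is explained by model i and the test by model j.
    target_lik[i], test_lik[j]: marginal likelihood of each i-vector alone.
    Assumption: plain likelihoods are used for clarity; real i-vector
    likelihoods are tiny, so a practical system would work in the log domain.
    """
    n = len(post_target)
    same_speaker = sum(
        post_target[i] * post_test[j] * joint_lik[i][j]
        for i in range(n)
        for j in range(n)
    )
    different_speaker = (
        sum(p * l for p, l in zip(post_target, target_lik))
        * sum(p * l for p, l in zip(post_test, test_lik))
    )
    return math.log(same_speaker) - math.log(different_speaker)
```

In this sketch an utterance whose SNR lies between two groups contributes to both models in proportion to its posteriors, which is what distinguishes the soft decision from picking a single SNR-matched PLDA model.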

Bibliographic reference.  Mak, Man-Wai (2014): "SNR-dependent mixture of PLDA for noise robust speaker verification", In INTERSPEECH-2014, 1855-1859.