In this paper, we consider speaker supervectors as observed variables and model them with a supervector probabilistic linear discriminant analysis model (SV-PLDA). By constraining the speaker and channel variability to lie in a common low-dimensional subspace, the model parameters and verification log-likelihood ratios (LLRs) can be computed in this low-dimensional subspace. Unlike the standard i-vector framework, SV-PLDA does not ignore the uncertainty arising from the variable length of a speech cut (observation noise). Moreover, the SV-PLDA model can be equivalently formulated in terms of an intermediate low-dimensional representation denoted projected i-vectors (Î-vectors). This intermediate representation facilitates the use of techniques that are important in practice, such as length normalization and multi-cut enrollment averaging. We validate the proposed model on a subset of the NIST extended-SRE12 telephone dataset for which test segments of nominal durations of 300, 100, and 30 seconds are available. We show significant improvements over the standard i-vector system for the short-duration test cuts and also compare SV-PLDA with recently proposed extensions of the i-vector framework that also account for the observation noise.
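As an illustration of the two practical techniques named above, the following is a minimal generic sketch of length normalization and multi-cut enrollment averaging applied to low-dimensional vectors. It is not the authors' implementation: the function names `length_normalize` and `average_enrollment`, the use of NumPy, and the choice to normalize each cut before averaging and once more afterwards are all assumptions made for illustration.

```python
import numpy as np


def length_normalize(x, eps=1e-12):
    """Scale each vector (last axis) to unit Euclidean length.

    `eps` is a hypothetical guard against division by zero.
    """
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def average_enrollment(cuts):
    """Combine several enrollment cuts of one speaker into a single vector.

    `cuts` has shape (n_cuts, dim). Assumption: each cut is length-normalized,
    the results are averaged, and the mean is normalized again.
    """
    normalized = length_normalize(cuts)
    return length_normalize(normalized.mean(axis=0))


# Hypothetical usage: three 5-dimensional enrollment vectors for one speaker.
rng = np.random.default_rng(0)
speaker_model = average_enrollment(rng.normal(size=(3, 5)))
```

The resulting `speaker_model` has unit length and can be scored against a length-normalized test vector by whatever back-end is in use.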
Bibliographic reference. Garcia-Romero, Daniel / McCree, Alan (2013): "Subspace-constrained supervector PLDA for speaker verification", In INTERSPEECH-2013, 2479-2483.