We propose a strategy for discriminative training of the i-vector extractor in speaker recognition. The original i-vector extractor was trained by maximum-likelihood generative modeling using the EM algorithm. In our approach, the i-vector extractor parameters are numerically optimized to minimize a discriminative cross-entropy error function. Two versions of i-vector extraction are studied: the original approach as defined for Joint Factor Analysis, and a simplified version in which the i-vector extractor matrix is orthogonalized.
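As a rough illustration of the kind of objective named above, the sketch below computes a mean binary cross-entropy over verification trials. The abstract does not give the exact form of the error function, so this assumes the standard formulation in which each trial score is treated as a log-odds value; the function name `cross_entropy_error` and its inputs are hypothetical, and any trial weighting or priors used in the paper are not modeled.

```python
import numpy as np

def cross_entropy_error(scores, labels):
    """Mean binary cross-entropy over verification trials (illustrative only).

    scores: one real-valued score per trial, treated here as log-odds
            (an assumption, not taken from the paper).
    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    signs = 2.0 * labels - 1.0  # map {0, 1} to {-1, +1}
    # log(1 + exp(-sign * score)), evaluated stably via logaddexp
    return float(np.mean(np.logaddexp(0.0, -signs * scores)))

# Uninformative scores (all zero) cost log(2) per trial.
print(cross_entropy_error([0.0, 0.0], [1, 0]))
```

In a discriminative training setup, a quantity of this kind would be minimized with respect to the extractor parameters by a numerical optimizer, since the scores depend on the i-vectors those parameters produce.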
Bibliographic reference. Glembek, Ondřej / Burget, Lukáš / Brümmer, Niko / Plchot, Oldřich / Matějka, Pavel (2011): "Discriminatively trained i-vector extractor for speaker verification", In INTERSPEECH-2011, 137-140.