ISCA Archive Interspeech 2013

Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques

A. Kanagasundaram, D. Dean, Javier Gonzalez-Dominguez, S. Sridharan, D. Ramos, Joaquin Gonzalez-Rodriguez

A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source and utterance-duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations of SUN-LDA are proposed, in which normalization techniques are used to capture source variation from both short and full-length development i-vectors: one based upon pooling (SUN-LDA-pooled) and the other on concatenation (SUN-LDA-concat) across the duration- and source-dependent session variation. Both the SUN-LDA-pooled and SUN-LDA-concat techniques are shown to provide improvement over traditional LDA on NIST 08 truncated 10sec-10sec evaluation conditions. The highest improvement is obtained with the SUN-LDA-concat technique, which achieves a relative improvement of 8% in EER for mismatched conditions and over 3% for matched conditions over traditional LDA approaches.


doi: 10.21437/Interspeech.2013-411

Cite as: Kanagasundaram, A., Dean, D., Gonzalez-Dominguez, J., Sridharan, S., Ramos, D., Gonzalez-Rodriguez, J. (2013) Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques. Proc. Interspeech 2013, 2465-2469, doi: 10.21437/Interspeech.2013-411

@inproceedings{kanagasundaram13_interspeech,
  author={A. Kanagasundaram and D. Dean and Javier Gonzalez-Dominguez and S. Sridharan and D. Ramos and Joaquin Gonzalez-Rodriguez},
  title={{Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2465--2469},
  doi={10.21437/Interspeech.2013-411}
}