Odyssey 2012 - The Speaker and Language Recognition Workshop

Singapore
June 25-28, 2012

Source Normalization for Language-Independent Speaker Recognition using i-Vectors

Mitchell McLaren, Miranti Indar Mandasari, David A. van Leeuwen

Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands

Source normalization (SN) is an effective means of improving the robustness of i-vector-based speaker recognition for under-resourced and unseen cross-speech-source evaluation conditions. SN estimates directions of undesired within-speaker variation more accurately than traditional methods when cross-source variation is not explicitly observed from each speaker in system development data. It can be incorporated into Within-Class Covariance Normalization (WCCN) as an effective preprocessing step to Probabilistic Linear Discriminant Analysis (PLDA) based speaker recognition with i-vectors.

This paper proposes to extend the application of source normalization to the reduction of language dependence in PLDA speaker recognition by normalizing for the variation that separates languages. Evaluated on the NIST 2008 and 2010 speaker recognition evaluation (SRE) data sets, the proposed Language Normalized WCCN (LN-WCCN) provides relative improvements of 26% in minimum DCF and 14% in EER under multilingual scenarios without detriment to common English-only conditions. LN-WCCN is also shown to significantly improve calibration performance when calibration parameters are learned from scores mismatched to evaluation conditions.
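As a rough illustration of the idea in the abstract, the following Python sketch treats the language label as the "source". The abstract does not give the formulation, so everything here is an assumption: the within-speaker covariance is taken as the total scatter minus a between-speaker scatter computed around each language's own mean, so that the variation separating languages is left in the within-speaker term and is whitened by WCCN. The function names, the synthetic data, and the use of NumPy are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def language_normalized_within_cov(ivectors, speakers, languages):
    """Assumed sketch: within-speaker covariance = total scatter minus
    between-speaker scatter measured around each language's mean."""
    X = np.asarray(ivectors, dtype=float)
    speakers = np.asarray(speakers)
    languages = np.asarray(languages)

    X = X - X.mean(axis=0)          # centre on the global mean
    total_scatter = X.T @ X

    between_scatter = np.zeros_like(total_scatter)
    for lang in np.unique(languages):
        in_lang = languages == lang
        lang_mean = X[in_lang].mean(axis=0)
        for spk in np.unique(speakers[in_lang]):
            sel = in_lang & (speakers == spk)
            diff = X[sel].mean(axis=0) - lang_mean
            between_scatter += sel.sum() * np.outer(diff, diff)

    # What remains holds session noise plus the between-language variation.
    return (total_scatter - between_scatter) / len(X)


def wccn_transform(within_cov):
    """Return B with B @ B.T == inv(within_cov) (standard WCCN)."""
    return np.linalg.cholesky(np.linalg.inv(within_cov))


# Synthetic stand-in for i-vectors: each speaker is seen in one language only,
# and the languages differ by an offset along the first dimension.
rng = np.random.default_rng(0)
dim, n_spk, n_per = 4, 20, 10
speakers = np.repeat(np.arange(n_spk), n_per)
languages = speakers % 2
lang_offset = np.array([3.0, 0.0, 0.0, 0.0])
spk_means = rng.normal(size=(n_spk, dim))
X = (spk_means[speakers]
     + rng.normal(scale=0.5, size=(n_spk * n_per, dim))
     + np.outer(languages, lang_offset))

W = language_normalized_within_cov(X, speakers, languages)
B = wccn_transform(W)
X_norm = X @ B                      # row-vector form of x' = B.T @ x
```

In this toy setting no speaker is observed in both languages, so a conventional per-speaker covariance estimate would not see the language offset at all; under the assumed formulation it ends up in `W` and is scaled down by the WCCN projection before any PLDA modelling.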


Bibliographic reference.  McLaren, Mitchell / Mandasari, Miranti Indar / Leeuwen, David A. van (2012): "Source normalization for language-independent speaker recognition using i-vectors", In Odyssey-2012, 55-61.