In state-of-the-art speaker recognition system, universal background model (UBM) plays a role of acoustic space division. Each Gaussian mixture of trained UBM represents one distinct acoustic region. The posterior probabilities of features belonging to each region are further used as core components of Baum-Welch statistics. Therefore, the quality of estimated Baum-Welch statistics depends highly on how acoustic regions are separable with each other. In this paper, we propose to transform the front end acoustical features into a space where the separability of mixtures of trained UBM can be optimized. To achieve this, an UBM was first trained from the acoustical features and a transformation matrix is estimated using linear discriminant analysis (LDA) by treating each mixture of trained UBM as independent class. Therefore, the proposed method named as UBM-based LDA (uLDA) does not require any speaker labels or other supervised information. The obtained transformation matrix is then applied to acoustic features for i-Vector extraction. Experimental results on the male part of core conditions of NIST SRE 2010 dataset confirmed the improved performance using proposed method.
Bibliographic reference. Yu, Chengzhu / Liu, Gang / Hansen, John H. L. (2014): "Acoustic feature transformation using UBM-based LDA for speaker recognition", In INTERSPEECH-2014, 1851-1854.