16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Sparse Coding of Total Variability Matrix

Longting Xu (1), Kong Aik Lee (2), Haizhou Li (2), Zhen Yang (1)

(1) NJUPT, China
(2) A*STAR, Singapore

In text-independent speaker verification, it has proven effective to represent variable-length, information-rich speech utterances as fixed-dimensional vectors, for instance in the form of i-vectors. An i-vector is a low-dimensional vector in the so-called total variability space, which is represented by a tall, thin rectangular matrix. Treating each row of the total variability matrix as a random vector, we examine the redundancy in this representation of the total variability space. We show that the total variability matrix is compressible and that this property can be exploited to reduce the memory and computational requirements of i-vector extraction. We also show that existing sparse coding and dictionary learning techniques can be easily adapted for this purpose. Experiments on the NIST SRE'10 dataset confirm that the total variability matrix can be represented with a smaller matrix without affecting performance.
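
As a rough illustration of the idea of coding each row of a tall matrix against a smaller dictionary, the following NumPy sketch may help. It is not the authors' method: the matrix T is random synthetic data rather than a trained total variability matrix, the dictionary D is assumed to be given rather than learned, the sizes are arbitrary, and the helper omp_encode is a hypothetical name for a plain orthogonal matching pursuit routine.

```python
import numpy as np

def omp_encode(D, x, n_nonzero):
    """Orthogonal matching pursuit: approximate x with at most
    n_nonzero rows (atoms) of the dictionary D (shape: atoms x dim)."""
    coef = np.zeros(D.shape[0])
    residual = x.copy()
    support = []
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < 1e-10:
            break  # x is already represented to numerical precision
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[support].T, x, rcond=None)
        residual = x - D[support].T @ sol
        coef[:] = 0.0
        coef[support] = sol
    return coef

rng = np.random.default_rng(0)
n_rows, dim, K, s = 500, 20, 40, 3   # illustrative sizes only (assumptions)

# Assumed dictionary with unit-norm atoms; in practice it would be learned.
D = rng.standard_normal((K, dim))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Synthetic stand-in for a tall matrix whose rows are sparse in D.
A_true = np.zeros((n_rows, K))
for i in range(n_rows):
    idx = rng.choice(K, size=s, replace=False)
    A_true[i, idx] = rng.standard_normal(s)
T = A_true @ D

# Encode every row of T with at most s dictionary atoms.
A = np.vstack([omp_encode(D, row, s) for row in T])

rel_err = np.linalg.norm(T - A @ D) / np.linalg.norm(T)
dense_storage = T.size
# Dictionary entries plus one value and one index per nonzero code entry.
sparse_storage = D.size + 2 * np.count_nonzero(A)
print(f"relative reconstruction error: {rel_err:.3e}")
print(f"stored numbers: dense={dense_storage}, dictionary+codes={sparse_storage}")
```

In this sketch the saving comes from storing the small dictionary plus a few coefficients per row instead of every entry of the tall matrix. How well this carries over to a real total variability matrix depends on the learned dictionary and the sparsity level, which the abstract does not specify.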

Bibliographic reference. Xu, Longting / Lee, Kong Aik / Li, Haizhou / Yang, Zhen (2015): "Sparse coding of total variability matrix", in INTERSPEECH-2015, 1022-1026.