INTERSPEECH 2015

In text-independent speaker verification, it has been shown effective to represent variable-length, information-rich speech utterances using fixed-dimensional vectors, for instance, in the form of i-vectors. An i-vector is a low-dimensional vector in the so-called total variability space, which is represented by a tall and thin rectangular matrix. Taking each row of the total variability matrix as a random vector, we look into the redundancy in representing the total variability space. We show that the total variability matrix is compressible and that this characteristic can be exploited to reduce the memory and computational requirements of i-vector extraction. We also show that existing sparse coding and dictionary learning techniques can be easily adapted for this purpose. Experiments on the NIST SRE'10 dataset confirm that the total variability matrix can be represented with a smaller matrix without affecting performance.
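As a rough illustration of the idea of treating each row of the total variability matrix as a vector to be sparse-coded, the sketch below approximates every row of a toy matrix `T` by a combination of at most `S` atoms from a dictionary `D`, using a plain orthogonal matching pursuit. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the toy sizes `N`, `R`, `K`, `S`, the synthetic data, and the storage-cost count are all hypothetical, and the dictionary is assumed to be given rather than learned.

```python
import numpy as np


def omp_row(D, t, n_nonzero):
    """Greedy orthogonal matching pursuit for one row.

    D: (K, R) dictionary with unit-norm atoms as rows (assumed given).
    t: (R,) row of the matrix to encode.
    Returns a (K,) code with at most n_nonzero non-zero entries, so that
    t is approximated by code @ D.
    """
    residual = t.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(D @ residual)
        if support:
            corr[support] = 0.0  # never select the same atom twice
        k = int(np.argmax(corr))
        if corr[k] < 1e-12:  # residual is already (numerically) zero
            break
        support.append(k)
        # Least-squares refit of the row on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[support].T, t, rcond=None)
        residual = t - D[support].T @ coef
    code = np.zeros(D.shape[0])
    if support:
        code[support] = coef
    return code


def sparse_code_matrix(T, D, n_nonzero):
    """Encode every row of T independently against the dictionary D."""
    return np.vstack([omp_row(D, t, n_nonzero) for t in T])


# Toy, hypothetical sizes: a tall and thin matrix T (N rows, R columns),
# a dictionary of K atoms, and at most S atoms per row.
rng = np.random.default_rng(0)
N, R, K, S = 200, 10, 20, 2

D = rng.standard_normal((K, R))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Synthetic T whose rows are exactly S-sparse in D (an assumption made
# only so that the toy example is compressible by construction).
A_true = np.zeros((N, K))
for i in range(N):
    idx = rng.choice(K, size=S, replace=False)
    A_true[i, idx] = rng.standard_normal(S)
T = A_true @ D

A = sparse_code_matrix(T, D, S)
rel_err = np.linalg.norm(T - A @ D) / np.linalg.norm(T)

# Crude storage count: dense T versus dictionary plus (value, index) pairs.
dense_cost = N * R
sparse_cost = K * R + 2 * int(np.count_nonzero(A))
print(f"relative error: {rel_err:.3e}, dense: {dense_cost}, sparse: {sparse_cost}")
```

In this toy setting the dictionary is shared by all rows, so the cost of storing it is amortised as the number of rows grows. In practice the dictionary would have to be estimated from the matrix itself with a dictionary learning algorithm, and the achievable sparsity would depend on the data.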
Bibliographic reference. Xu, Longting / Lee, Kong Aik / Li, Haizhou / Yang, Zhen (2015): "Sparse coding of total variability matrix", In INTERSPEECH-2015, 1022-1026.