8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Dimension Reduction for Speaker Identification Based on Mutual Information

Xugang Lu, Jianwu Dang

JAIST, Japan

Dimension reduction is a necessary step in speech feature extraction for a speaker identification system. The Discrete Cosine Transform (DCT) and Principal Component Analysis (PCA) are widely used for this purpose. The basis vectors that contribute most to the distribution variance or to the reconstruction accuracy of the speech data set are chosen from the DCT or PCA basis vector pool, and the data set is transformed by projecting it onto the selected basis vectors. However, preserving maximum distribution variance or high reconstruction accuracy does not guarantee that speaker-discriminative information is optimally preserved. In this paper, we propose a basis vector selection method based on mutual information, which guarantees that highly speaker-discriminative information is kept. Mutual information is used to measure the dependency between the features extracted with the basis vectors and the speaker class labels, and the basis vectors with high mutual information are chosen for feature extraction. Because a single speaker feature may be encoded in more than one basis vector, we further propose using joint mutual information, which takes the dependency between feature variables into account. Using the basis vectors selected from the DCT or PCA basis vector pool, we extracted features for speaker identification experiments. Experimental results show that the proposed features reduced the speaker identification error rate by 11% and 8% on average for DCT-based and PCA-based features, respectively.
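The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram-based mutual information estimator, the bin count, the synthetic two-class data, and all function names (dct_basis, mutual_information, rank_basis_by_mi) are assumptions made for illustration. The sketch ranks each DCT basis vector by the mutual information between its projection coefficient and the class label, and compares that ranking with a variance-based ranking. It covers only the single-feature case, not the joint mutual information extension.

```python
import numpy as np


def dct_basis(n):
    """Orthonormal DCT-II basis; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0] /= np.sqrt(2.0)
    return basis


def mutual_information(feature, labels, bins=16):
    """Histogram estimate (an assumed estimator) of I(feature; label) in bits."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    f_idx = np.clip(np.digitize(feature, edges[1:-1]), 0, bins - 1)
    classes, y_idx = np.unique(labels, return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (f_idx, y_idx), 1.0)  # count co-occurrences
    joint /= joint.sum()
    pf = joint.sum(axis=1, keepdims=True)  # marginal over feature bins
    py = joint.sum(axis=0, keepdims=True)  # marginal over classes
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (pf @ py)[nz])))


def rank_basis_by_mi(data, labels, basis, bins=16):
    """Project data onto every basis vector and rank vectors by MI with labels."""
    coeffs = data @ basis.T
    scores = np.array(
        [mutual_information(coeffs[:, k], labels, bins) for k in range(basis.shape[0])]
    )
    return np.argsort(scores)[::-1], scores


# Synthetic data (hypothetical): coefficient 1 has large variance but carries
# no class information; coefficient 5 has small variance but separates classes.
rng = np.random.default_rng(0)
n_dim, n_per_class = 8, 500
basis = dct_basis(n_dim)
labels = np.repeat([0, 1], n_per_class)
coeffs_true = rng.normal(0.0, 0.1, size=(labels.size, n_dim))
coeffs_true[:, 1] = rng.normal(0.0, 5.0, size=labels.size)
coeffs_true[:, 5] += np.where(labels == 0, -1.0, 1.0)
data = coeffs_true @ basis  # back to the signal domain

mi_order, scores = rank_basis_by_mi(data, labels, basis)
variance_order = np.argsort((data @ basis.T).var(axis=0))[::-1]

print("top basis vector by mutual information:", mi_order[0])
print("top basis vector by variance:", variance_order[0])
```

In this toy setting the variance criterion favours the high-variance, uninformative direction, while the mutual information criterion favours the direction that separates the two classes, which is the motivation stated in the abstract.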


Bibliographic reference.  Lu, Xugang / Dang, Jianwu (2007): "Dimension reduction for speaker identification based on mutual information", In INTERSPEECH-2007, 2021-2024.