This paper presents a data selection method for speaker recognition. Because more training data does not guarantee better results, how the data are selected becomes important. In GMM-UBM speaker recognition, the universal background model (UBM) is trained to represent the speaker-independent distribution of acoustic features, while the Gaussian mixture model (GMM) speaker model is tailored to a specific speaker. We apply the maximum entropy criterion to remove redundant feature frames in UBM training and to select discriminative feature frames in GMM speaker modeling. Experiments on the 2008 NIST Speaker Recognition Evaluation corpus show that the proposed method outperforms a baseline system without data selection.
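The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way an entropy-driven frame selection could look, not the authors' algorithm. It assumes each feature frame has already been assigned a discrete cluster label (for example by vector quantization, which is an assumption made here), and it greedily keeps frames so that the label histogram of the kept subset has the highest Shannon entropy. The function names `histogram_entropy` and `select_frames_max_entropy` and the `budget` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict


def histogram_entropy(counts):
    """Shannon entropy (in nats) of a histogram given as {label: count}."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def select_frames_max_entropy(labels, budget):
    """Greedily pick up to `budget` frame indices with a high-entropy label histogram.

    labels[i] is an assumed cluster label for frame i (hypothetical input;
    the abstract does not say how frames are represented).
    """
    # Group frame indices by label, preserving their original order.
    remaining = defaultdict(list)
    for idx, lab in enumerate(labels):
        remaining[lab].append(idx)

    selected = []
    counts = Counter()
    while len(selected) < budget and any(remaining.values()):
        best_lab, best_h = None, -1.0
        for lab, idxs in remaining.items():
            if not idxs:
                continue
            # Tentatively add one frame of this label and score the histogram.
            counts[lab] += 1
            h = histogram_entropy(counts)
            counts[lab] -= 1
            if h > best_h:
                best_lab, best_h = lab, h
        # Commit the label that gave the highest entropy.
        counts[best_lab] += 1
        selected.append(remaining[best_lab].pop(0))
    return sorted(selected)


# Usage: label 0 dominates the data; the selection evens out the histogram.
labels = [0] * 8 + [1] * 2 + [2] * 2
kept = select_frames_max_entropy(labels, budget=6)
kept_histogram = Counter(labels[i] for i in kept)
```

In this toy setting, frames from the over-represented label are the ones dropped, which is the sense in which such a criterion removes redundancy. A real system would operate on continuous acoustic features and would need a separate, discriminative criterion for the speaker-model stage.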
Bibliographic reference. Huang, Chien-Lin / Ma, Bin (2011): "Maximum entropy based data selection for speaker recognition", In INTERSPEECH-2011, 2713-2716.