This work explores speaker verification using fixed-phrase short utterances. A novel speaker verification system based on Gaussian posteriorgrams is proposed, in which the posteriorgram vectors are computed from a speaker-specific Gaussian mixture model (GMM). The enrollment utterances of each target speaker are labeled with a GMM trained on that speaker's data. The test trials are then labeled with the claimed speaker's GMM. Dynamic time warping (DTW) is used to compute a match score between the posteriorgrams of the claimed speaker and those of the test trial. The proposed approach is evaluated on the fixed pass-phrase subset of the recent RSR2015 database. For contrast, we have also developed a state-of-the-art i-vector system with a probabilistic linear discriminant analysis (PLDA) classifier. The proposed framework is found to perform substantially better than the i-vector based contrast system. We hypothesize that this large improvement stems from the use of speaker-specific variance information in generating the posteriorgram representations. Evaluating the proposed framework with non-speaker-specific variances resulted in significant performance degradation, which confirmed our hypothesis.
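The pipeline described above (labeling frames with a speaker-specific GMM, then matching posteriorgrams with DTW) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the diagonal-covariance GMM, the negative-log dot-product local distance, and the path-length normalization are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def gaussian_posteriorgram(frames, weights, means, variances):
    """Per-frame component posteriors of a diagonal-covariance GMM.

    frames: list of T feature vectors; weights: K mixture weights;
    means, variances: K vectors each. Returns T lists of K posteriors.
    Diagonal covariances are an assumption of this sketch.
    """
    posteriorgram = []
    for x in frames:
        log_joint = []
        for w, mu, var in zip(weights, means, variances):
            # log of w_k * N(x; mu_k, diag(var_k))
            ll = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
            log_joint.append(ll)
        peak = max(log_joint)  # subtract the max for numerical stability
        unnorm = [math.exp(v - peak) for v in log_joint]
        total = sum(unnorm)
        posteriorgram.append([u / total for u in unnorm])
    return posteriorgram


def dtw_score(p, q, eps=1e-10):
    """Length-normalized DTW cost between two posteriorgrams (lower = closer).

    The local distance, -log of the frame-wise dot product, and the
    division by (len(p) + len(q)) are assumptions of this sketch.
    """
    n, m = len(p), len(q)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dot = sum(a * b for a, b in zip(p[i - 1], q[j - 1]))
            local = -math.log(max(dot, eps))
            # Standard step pattern: vertical, horizontal, or diagonal move.
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)
```

In a verification setting along the lines described, `weights`, `means`, and `variances` would come from the claimed speaker's GMM, both the enrollment and the test frames would be passed through `gaussian_posteriorgram` with those parameters, and the resulting `dtw_score` would be compared against a decision threshold (the thresholding step is also an assumption here).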
Bibliographic reference. Jelil, Sarfaraz / Das, Rohan Kumar / Sinha, Rohit / Prasanna, S. R. Mahadeva (2015): "Speaker verification using Gaussian posteriorgrams on fixed phrase short utterances", In INTERSPEECH-2015, 1042-1046.