Sixth International Conference on Spoken Language Processing
The paper considers text independent speaker identification over the telephone using short training and testing data. Gaussian Mixture Modeling (GMM) is used in the testing phase, but the parameters of the model are taken from clusters obtained for the training data by an adequate choice of feature vectors and a distance measure without optimization in the maximum likelihood (ML) sense. This distance-based GMM (DB-GMM) approach was evaluated by experiments in speaker identification from short telephone-speech data for a few feature vectors and distance measures. The selected feature vectors were Line Spectra Pairs (LSP) and Mel Frequency Cepstra (MFC). The selected distance measures were weighted Euclidean distance with IHM and BPL, respectively. DB-GMM showed consistently better performance than GMM trained by the expectationmaximization (EM) algorithm. Another notable observation is that a full covariance GMM (that is more comfortably trained by DB-GMM) always achieved significantly better performance than diagonal covariance GMM.
[PDF file damaged on original CD. W.H.]
Bibliographic reference. Zilca, Ran D. / Bistritz, Yuval (2000): "Distance-based Gaussian mixture model for speaker recognition over the telephone", In ICSLP-2000, vol.3, 1001-1004.