Sixth International Conference on Spoken Language Processing
Because speaker-specific training data is always limited, a speaker model has an optimum number of components for any given amount of data. Here, this is investigated for Gaussian mixture models (GMM) with and without world-model adaptation. Test results show that maximising the number of components in a speaker model can improve speaker recognition results. Comparisons with vector quantisation (VQ) indicate that sensible use of out-of-class data is essential for optimising a recognition system.
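As an illustration of how a speaker GMM and a world model can be combined at test time, the sketch below scores test frames by the average log-likelihood ratio between the two models. The abstract does not state the scoring rule used in the paper, so this is an assumption based on common practice; the diagonal covariances, the function names, and the toy parameter values are all hypothetical and chosen only for the example.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_llr(frames, speaker_gmm, world_gmm):
    """Mean per-frame log-likelihood ratio of speaker model to world model."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, world_gmm)
        for x in frames
    )
    return total / len(frames)

# Toy two-component models (hypothetical values, not taken from the paper).
world_gmm = [(0.5, [0.0, 0.0], [4.0, 4.0]), (0.5, [3.0, 3.0], [4.0, 4.0])]
speaker_gmm = [(0.5, [1.0, 1.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])]

target_frames = [[1.1, 0.9], [1.9, 2.2], [1.4, 1.6]]
impostor_frames = [[-2.0, -1.5], [5.0, 4.5], [-1.0, 6.0]]

target_score = average_llr(target_frames, speaker_gmm, world_gmm)
impostor_score = average_llr(impostor_frames, speaker_gmm, world_gmm)
```

In this toy setup the frames close to the speaker model's means receive a positive score and the distant frames a negative one. Adding components to `speaker_gmm` simply lengthens the list of tuples; how many components the available training data can support is the question the paper investigates.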
Bibliographic reference. Stapert, Robert / Mason, John S. / Auckenthaler, Roland (2000): "Optimisation of GMM in speaker recognition", in ICSLP-2000, vol. 3, 997-1000.