ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification

Martigny, Switzerland
April 7-9, 1994

Speaker Identification and Verification using Gaussian Mixture Speaker Models

Douglas A. Reynolds

MIT Lincoln Laboratory, Speech Systems Technology Group, Lexington, MA, USA

This paper presents high-performance speaker identification and verification systems based on Gaussian mixture speaker models, which are robust, statistically based representations of speaker identity. The focus is on unconstrained speech, although the systems can equally be applied to text-dependent tasks. The identification system is a maximum likelihood classifier, and the verification system is a likelihood ratio hypothesis tester that uses background speaker normalisation.
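The two decision rules named above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance mixtures whose parameters are already trained, scores utterances by average per-frame log-likelihood, and normalises the claimed speaker's score by the log of the mean background-speaker likelihood. The speaker names, model parameters, feature frames, and threshold are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def identify(frames, speaker_models):
    """Maximum likelihood classifier: pick the best-scoring speaker model."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )

def verify(frames, claimed, speaker_models, background, threshold=0.0):
    """Log-likelihood ratio test against a set of background speakers.

    Assumption: the background score is the log of the mean of the
    background speakers' (duration-normalised) likelihoods.
    """
    claimed_score = gmm_log_likelihood(frames, speaker_models[claimed])
    bg_scores = [gmm_log_likelihood(frames, speaker_models[b]) for b in background]
    peak = max(bg_scores)
    bg_score = peak + math.log(
        sum(math.exp(s - peak) for s in bg_scores) / len(bg_scores)
    )
    ratio = claimed_score - bg_score
    return ratio, ratio > threshold

# Hypothetical two-dimensional models and test frames, for illustration only.
models = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "bob": [(1.0, [5.0, 5.0], [1.0, 1.0])],
    "carol": [(1.0, [-5.0, 5.0], [1.0, 1.0])],
}
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best_speaker = identify(frames, models)
true_ratio, true_accepted = verify(frames, "alice", models, ["bob", "carol"])
false_ratio, false_accepted = verify(frames, "bob", models, ["alice", "carol"])
```

In this toy setting the frames lie near the "alice" components, so identification selects "alice", a true claim yields a positive log-likelihood ratio, and an impostor claim of "bob" yields a negative one.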

The systems are evaluated on three widely used speech databases: TIMIT, NTIMIT and Switchboard. The different levels of degradation and variability found in these databases allow system performance to be examined across different task domains. An identification accuracy of 99.7% was obtained for a 168-speaker population on TIMIT, 76.2% on NTIMIT and 82.8% for a 113-speaker population on Switchboard. Global threshold equal error rates of 0.3%, 5.4% and 7.0% were obtained in verification experiments on TIMIT, NTIMIT and Switchboard, respectively.
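A global threshold equal error rate is the operating point at which a single threshold, shared by all speakers, makes the false acceptance and false rejection rates equal. The sketch below shows one simple way to estimate it from pooled scores. The abstract does not describe the paper's exact procedure, so the threshold sweep is an assumption and the score lists are invented for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a single global threshold.

    A trial is accepted when its score is >= the threshold. Returns the
    (approximate) EER and the threshold at which the false acceptance and
    false rejection rates are closest.
    """
    best = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: true-speaker trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]

# Hypothetical pooled likelihood ratio scores from all claimed speakers.
target_scores = [2.0, 1.5, 1.0, 0.2]
impostor_scores = [0.5, -0.3, -1.0, -2.0]

eer, eer_threshold = equal_error_rate(target_scores, impostor_scores)
```

With these made-up scores, one true-speaker trial falls below the chosen threshold and one impostor trial reaches it, so both error rates are one in four.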


Bibliographic reference. Reynolds, Douglas A. (1994): "Speaker identification and verification using Gaussian mixture speaker models", in ASRIV-1994, 27-30.