8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Model Compression for GMM Based Speaker Recognition Systems

Douglas A. Reynolds

Massachusetts Institute of Technology, USA

For large-scale deployments of speaker verification systems, model size can be an important issue, not only for minimizing storage requirements but also for reducing the transfer time of models over networks. Model size is also critical for deployments on small, portable devices. In this paper we present a new model compression technique for Gaussian Mixture Model (GMM) based speaker recognition systems. For GMM systems using adaptation from a background model, the compression technique exploits the fact that speaker models are adapted from a single speaker-independent model, so not all parameters need to be stored. We present results on the 2002 NIST speaker recognition evaluation cellular telephone corpus and show that the compression technique provides a good tradeoff of compression ratio to performance loss. We are able to achieve a 56:1 compression (624 KB → 11 KB) with only a 3.2% relative increase in equal error rate (EER) (9.1% → 9.4%).
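To illustrate the general idea that an adapted speaker model need not be stored in full, the following is a minimal sketch, not the paper's actual algorithm. It assumes mean-only adaptation, keeps only the components whose means moved furthest from the background model, and stores those offsets at reduced precision. The function names, the model dimensions (2048 components, 38 features), the `top_k` selection rule, and the 16-bit storage format are all hypothetical choices made for illustration.

```python
import numpy as np

def compress_means(ubm_means, speaker_means, top_k):
    """Keep only the top_k components whose means moved furthest from the background model.

    Hypothetical scheme for illustration; the selection rule is an assumption.
    """
    deltas = speaker_means - ubm_means            # shape: (n_components, dim)
    shift = np.linalg.norm(deltas, axis=1)        # how far each component moved
    keep = np.sort(np.argsort(shift)[-top_k:])    # indices of the most-adapted components
    # Store indices as 16-bit integers and offsets as 16-bit floats (assumed format).
    return keep.astype(np.uint16), deltas[keep].astype(np.float16)

def decompress_means(ubm_means, keep, stored_deltas):
    """Rebuild speaker means from the shared background model plus the stored offsets."""
    means = ubm_means.copy()
    means[keep] += stored_deltas.astype(ubm_means.dtype)
    return means

# Synthetic data: sizes are assumptions, not values taken from the paper.
rng = np.random.default_rng(0)
ubm_means = rng.normal(size=(2048, 38)).astype(np.float32)
speaker_means = ubm_means.copy()
adapted = rng.choice(2048, size=100, replace=False)   # components touched by adaptation
speaker_means[adapted] += rng.normal(scale=0.5, size=(100, 38)).astype(np.float32)

keep, stored = compress_means(ubm_means, speaker_means, top_k=100)
restored = decompress_means(ubm_means, keep, stored)

full_bytes = speaker_means.nbytes
compressed_bytes = keep.nbytes + stored.nbytes
print(f"full: {full_bytes} bytes, compressed: {compressed_bytes} bytes")
```

In this sketch the background model is shared by all speakers, so only the per-speaker indices and offsets count toward the stored model size. The reconstruction is lossy because of the 16-bit offsets, which mirrors the tradeoff between compression ratio and performance loss described above.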


Bibliographic reference. Reynolds, Douglas A. (2003): "Model compression for GMM based speaker recognition systems", in EUROSPEECH-2003, pp. 2005-2008.