ISCA Archive Interspeech 2005

MLLR transforms as features in speaker recognition

Andreas Stolcke, Luciana Ferrer, Sachin Kajarekar, Elizabeth Shriberg, Anand Venkataraman

We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it normalizes for the choice of spoken words in text-independent speaker verification. Affine transforms are computed for the Gaussian means of the acoustic models used in a recognizer, using maximum likelihood linear regression (MLLR). The high-dimensional vectors formed by the transform coefficients are then modeled as speaker features using support vector machines (SVMs). The resulting speaker verification system is competitive with, and in some cases significantly more accurate than, state-of-the-art cepstral Gaussian mixture and SVM systems. Further improvements are obtained by combining baseline and MLLR-based systems.
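The feature construction described above can be illustrated with a minimal sketch. It assumes each MLLR transform is an affine map (A, b) that adapts a Gaussian mean as mu' = A mu + b, and that a speaker's feature vector is formed by flattening the coefficients of each transform and concatenating them. The function names, the toy 2-dimensional transform, and the concatenation order are illustrative assumptions, not details taken from the paper; estimating the transforms and training the SVM are left out.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def apply_affine(A: Matrix, b: List[float], mu: List[float]) -> List[float]:
    """Adapt one Gaussian mean with an affine transform: mu' = A mu + b."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
        for row, b_i in zip(A, b)
    ]


def transform_to_feature(A: Matrix, b: List[float]) -> List[float]:
    """Flatten the d x (d + 1) matrix [A | b] row by row into one vector."""
    feature: List[float] = []
    for row, b_i in zip(A, b):
        feature.extend(row)
        feature.append(b_i)
    return feature


def speaker_feature(transforms: List[Tuple[Matrix, List[float]]]) -> List[float]:
    """Concatenate flattened transforms (assumed layout: one after another)."""
    feature: List[float] = []
    for A, b in transforms:
        feature.extend(transform_to_feature(A, b))
    return feature


# Toy example with d = 2 (hypothetical values; real transforms are much larger).
A = [[1.0, 0.1], [0.0, 0.9]]
b = [0.5, -0.2]
adapted_mean = apply_affine(A, b, [2.0, 1.0])
vector = speaker_feature([(A, b), (A, b)])
```

With d-dimensional acoustic features, each transform contributes d(d + 1) coefficients, so the speaker vector grows quickly with the number of transforms. A vector like `vector` would then be passed to an SVM toolkit as one training or test example per conversation side.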

doi: 10.21437/Interspeech.2005-647

Cite as: Stolcke, A., Ferrer, L., Kajarekar, S., Shriberg, E., Venkataraman, A. (2005) MLLR transforms as features in speaker recognition. Proc. Interspeech 2005, 2425-2428, doi: 10.21437/Interspeech.2005-647

  author={Andreas Stolcke and Luciana Ferrer and Sachin Kajarekar and Elizabeth Shriberg and Anand Venkataraman},
  title={{MLLR transforms as features in speaker recognition}},
  booktitle={Proc. Interspeech 2005},