Odyssey 2008: The Speaker and Language Recognition Workshop

Stellenbosch, South Africa
January 21-24, 2008

Comparisons of Recent Speaker Recognition Approaches based on Word-Conditioning

Howard Lei (1,2), Nikki Mirghafori (2)

(1) The International Computer Science Institute; (2) The University of California, Berkeley, CA, USA

We examine the effectiveness of several speaker recognition approaches based on word-conditioning. Subsets of 62 keywords (used for word-conditioning) are evaluated, individually and in combination, for four approaches: a keyword HMM approach, a supervector keyword HMM approach, a keyword phone N-grams approach, and a keyword phone HMM approach. In the individual keyword results, we demonstrate the effectiveness of acoustic features and the importance of keyword frequency; the keywords "yeah" and "you know" outperform the others. In the keyword combination experiments, we demonstrate the power of SVMs in conjunction with acoustic features: the supervector keyword HMM approach (4.3% EER) outperforms the other keyword-based approaches and achieves a 6.5% relative improvement over the GMM baseline (4.6% EER) on the SRE06 8-conversation-side task.
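
The 6.5% figure is consistent with a reduction in EER measured relative to the GMM baseline rather than an absolute difference in percentage points. The arithmetic can be checked with a minimal sketch, assuming that interpretation; the function name `relative_improvement` is hypothetical and only the two EER values come from the abstract.

```python
def relative_improvement(baseline_eer: float, system_eer: float) -> float:
    """Return the EER reduction as a percentage of the baseline EER.

    Assumption: "improvement" means (baseline - system) / baseline.
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - system_eer) / baseline_eer


# EER values quoted in the abstract, in percent.
gmm_baseline_eer = 4.6
supervector_eer = 4.3

improvement = relative_improvement(gmm_baseline_eer, supervector_eer)
print(f"{improvement:.1f}% relative improvement")
```

The absolute difference between the two systems is 0.3 percentage points of EER.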

Bibliographic reference. Lei, Howard / Mirghafori, Nikki (2008): "Comparisons of recent speaker recognition approaches based on word-conditioning", In Odyssey-2008, paper 028.