7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

ASR Dependent Techniques for Speaker Identification

Alex Park, Timothy J. Hazen

MIT Laboratory for Computer Science, USA

Traditional text independent speaker recognition systems are based on Gaussian Mixture Models (GMMs) trained globally over all speech from a given speaker. In this paper, we describe alternative methods for performing speaker identification that utilize domain dependent automatic speech recognition (ASR) to provide a phonetic segmentation of the test utterance. When evaluated on YOHO, several of these approaches were able to outperform previously published results on the speaker ID task. On a more difficult conversational speech task, we were able to use a combination of classifiers to reduce identification error rates on single test utterances. Over multiple utterances, the ASR dependent approaches performed significantly better than the ASR independent methods. Using an approach we call speaker adaptive modeling for speaker identification, we were able to reduce speaker identification error rates by 39% over a baseline GMM approach when observing five test utterances from a speaker.
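As an illustration of the baseline the abstract refers to, the following is a minimal sketch of GMM-based speaker identification scoring: each speaker has one global GMM, and the identified speaker is the one whose model assigns the highest total log-likelihood to the test frames, summed over one or more utterances. The abstract does not give the features, model sizes, or training procedure, so the function names, the diagonal-covariance assumption, and the toy two-dimensional models below are all hypothetical. The ASR dependent methods described in the paper are not shown.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D).
    The diagonal covariance is an assumption made for this sketch.
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    log_weighted = np.log(weights) + log_comp
    # Log-sum-exp over mixture components, for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return per_frame.sum()


def identify_speaker(utterances, speaker_models):
    """Return the best-scoring speaker and all scores.

    Scores are summed over every utterance, so observing more test
    utterances from the same speaker simply adds more evidence.
    """
    scores = {
        name: sum(gmm_log_likelihood(u, w, mu, var) for u in utterances)
        for name, (w, mu, var) in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: two speakers, two mixture components each.
speaker_models = {
    "spk_a": (np.array([0.5, 0.5]),
              np.array([[0.0, 0.0], [1.0, 1.0]]),
              np.ones((2, 2))),
    "spk_b": (np.array([0.5, 0.5]),
              np.array([[3.0, 3.0], [4.0, 4.0]]),
              np.ones((2, 2))),
}

# Five synthetic "utterances" of 50 frames each, drawn near spk_b's models.
rng = np.random.default_rng(0)
utterances = [rng.normal(3.0, 1.0, size=(50, 2)) for _ in range(5)]

best, scores = identify_speaker(utterances, speaker_models)
print(best)
```

Summing log-likelihoods across utterances treats the utterances as independent evidence for the same speaker, which is one simple way to model the multi-utterance condition mentioned in the abstract.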


Bibliographic reference.  Park, Alex / Hazen, Timothy J. (2002): "ASR dependent techniques for speaker identification", In ICSLP-2002, 1337-1340.