8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

A Comparison of Normalization and Training Approaches for ASR-Dependent Speaker Identification

Alex Park, Timothy J. Hazen


In this paper, we discuss a speaker identification approach, called ASR-dependent speaker identification, that incorporates phonetic knowledge into the models for each speaker. This approach differs from traditional methods for text-independent speaker identification, such as global Gaussian mixture modeling, which typically ignore the phonetic content of the speech signal. We introduce a new score normalization approach, called phone adaptive normalization, which improves upon our previous speaker adaptive normalization technique. This paper also examines the use of automatically generated transcriptions during the training of our speaker models. Experiments show that speaker models trained using automatically generated transcriptions achieve the same performance as models trained using manually generated transcriptions.


Bibliographic reference. Park, Alex / Hazen, Timothy J. (2004): "A comparison of normalization and training approaches for ASR-dependent speaker identification", in INTERSPEECH-2004, 2601-2604.