INTERSPEECH 2004 - ICSLP
8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Cluster-Dependent Modeling and Confidence Measure Processing for In-Set/Out-of-Set Speaker Identification

Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen

University of Colorado at Boulder, USA

In this paper, we propose an approach to text-independent open-set speaker identification. The in-set speakers are first clustered into smaller subsets without merging speaker models. The anti-speaker (background) model is then adapted for each subset so as to minimize the identification errors of the pseudo-impostors during the training stage. Score normalization is applied to align the score distributions of all in-set speakers to a single scale. Confidence measure processing is then used to distinguish in-set from out-of-set speakers. Experiments with the TIMIT and CU-Accent corpora show average improvements in Equal Error Rate of 20.28% and 8.35%, respectively, over the baseline. Finally, a probe experiment that considers prosody for in-set speaker detection is also included.
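
As a rough illustration of the last two stages (score normalization followed by an in-set/out-of-set decision), the sketch below may help. It is not the authors' implementation: the abstract does not specify the normalization or the confidence measure, so a per-speaker z-score computed from pseudo-impostor scores and a single fixed threshold are used as stand-ins. All function names and the toy scores are hypothetical.

```python
from statistics import mean, stdev

def znorm_params(impostor_scores):
    """Per-speaker (mean, std) of pseudo-impostor scores (assumed normalization)."""
    return {spk: (mean(s), stdev(s)) for spk, s in impostor_scores.items()}

def decide(raw_scores, params, threshold):
    """Return (best_speaker, normalized_score, is_in_set) for one test utterance."""
    # Put every speaker's score on a common scale before comparing them.
    normalized = {spk: (score - params[spk][0]) / params[spk][1]
                  for spk, score in raw_scores.items()}
    best = max(normalized, key=normalized.get)
    # Stand-in confidence measure: compare the best normalized score to a threshold.
    return best, normalized[best], normalized[best] >= threshold

def equal_error_rate(in_set_scores, out_of_set_scores):
    """Approximate the EER by sweeping the threshold over all observed scores."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(in_set_scores) | set(out_of_set_scores)):
        false_reject = sum(s < t for s in in_set_scores) / len(in_set_scores)
        false_accept = sum(s >= t for s in out_of_set_scores) / len(out_of_set_scores)
        gap = abs(false_reject - false_accept)
        if gap < best_gap:
            best_gap, eer = gap, (false_reject + false_accept) / 2
    return eer

# Toy data: speaker "B" produces systematically higher raw scores than "A".
params = znorm_params({"A": [0.0, 2.0], "B": [10.0, 14.0]})
speaker, score, in_set = decide({"A": 3.0, "B": 13.0}, params, threshold=1.0)
```

In this toy case the raw scores favor "B", but after normalization "A" ranks first, which is the reason for placing all in-set speaker scores on one scale before thresholding.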

Bibliographic reference. Angkititrakul, Pongtep / Baghaii, Sepideh / Hansen, John H. L. (2004): "Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification", in INTERSPEECH-2004, 2385-2388.