Odyssey 2008: The Speaker and Language Recognition Workshop
Stellenbosch, South Africa
Test normalization (T-Norm) is a score normalization technique that is regularly and successfully applied in text-independent speaker recognition. It is applied less often to text-dependent or text-prompted speaker recognition, mainly because the improvement it yields in this context is more modest. In this paper we present a novel way to improve the performance of T-Norm for text-dependent systems: applying T-normalization to scores at the phoneme or sub-phoneme level instead of at the sentence level. Experiments on the YOHO corpus show that, while standard sentence-level T-Norm does not improve the equal error rate (EER), phoneme-level and sub-phoneme-level T-Norm produce relative EER reductions of 18.9% and 20.1%, respectively, on a state-of-the-art HMM-based text-dependent speaker recognition system. The gains are even larger at operating points with low false acceptance rates.
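As a rough illustration of the difference between the two normalization levels, the following Python sketch contrasts sentence-level T-Norm with a phoneme-level variant. It is a minimal sketch under stated assumptions, not the system described in the paper: the function names, the use of per-phoneme scores stored in plain lists, the population standard deviation, and the averaging of normalized phoneme scores into a sentence score are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev


def t_norm(target_score, cohort_scores, eps=1e-9):
    """Standard T-Norm: centre and scale a score using the mean and
    standard deviation of the cohort (impostor model) scores obtained
    on the same test material."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # assumption: population std. deviation
    return (target_score - mu) / max(sigma, eps)


def sentence_level_t_norm(target_phone_scores, cohort_phone_scores):
    """Pool the per-phoneme scores into one sentence score per model,
    then normalize once at the sentence level.

    target_phone_scores: per-phoneme scores from the claimed speaker model.
    cohort_phone_scores: one list of per-phoneme scores per cohort model.
    """
    target_total = mean(target_phone_scores)
    cohort_totals = [mean(scores) for scores in cohort_phone_scores]
    return t_norm(target_total, cohort_totals)


def phoneme_level_t_norm(target_phone_scores, cohort_phone_scores):
    """Normalize each phoneme score against the cohort scores for that
    same phoneme, then combine the normalized scores.

    Assumption: the combination is a plain average; the abstract does
    not specify how the paper combines them.
    """
    normalized = []
    for i, score in enumerate(target_phone_scores):
        cohort_i = [scores[i] for scores in cohort_phone_scores]
        normalized.append(t_norm(score, cohort_i))
    return mean(normalized)


# Toy, made-up scores: two phonemes, two cohort models.
target = [2.0, 4.0]
cohort = [[0.0, 1.0], [2.0, 5.0]]
sentence_score = sentence_level_t_norm(target, cohort)
phoneme_score = phoneme_level_t_norm(target, cohort)
```

In this toy case the two functions return different values because the cohort score spread differs between the two phonemes, which is the kind of unit-dependent variation that finer-grained normalization can take into account.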
Bibliographic reference. Toledano, Doroteo T. / Esteve-Elizalde, Cristina / Gonzalez-Rodriguez, Joaquin / Pozo, Ruben Fernandez / Gomez, Luis Hernandez (2008): "Phoneme and sub-phoneme t-normalization for text-dependent speaker recognition", In Odyssey-2008, paper 029.