ISCA Archive Odyssey 2010

Intra-speaker variability effects on Speaker Verification performance

Juliette Kahn, Nicolas Audibert, Solange Rossato, Jean-François Bonastre

Speaker verification systems have shown significant progress and have reached a level of performance that makes their use in practical applications possible. Nevertheless, large differences in performance are observed, depending on the speaker or the speech excerpt used. This motivates an analysis of system performance that goes deeper than the average error rate. In this paper, the effect of the training excerpt is investigated using ALIZE/SpkDet on two different corpora: NIST-SRE 08 (conversational speech) and BREF 120 (controlled read speech). The results show that speaker verification system (SVS) performance is highly dependent on the voice samples used to train the speaker model: the overall Equal Error Rate (EER) ranges from 4.1% to 29.1% on NIST-SRE 08 and from 1.0% to 33.0% on BREF 120. The hypothesis that these performance differences are explained by the phonetic content of the voice samples is studied on BREF 120.


Cite as: Kahn, J., Audibert, N., Rossato, S., Bonastre, J.-F. (2010) Intra-speaker variability effects on Speaker Verification performance. Proc. The Speaker and Language Recognition Workshop (Odyssey 2010), paper 21

@inproceedings{kahn10_odyssey,
  author={Juliette Kahn and Nicolas Audibert and Solange Rossato and Jean-François Bonastre},
  title={{Intra-speaker variability effects on Speaker Verification performance}},
  year=2010,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2010)},
  pages={paper 21}
}