The aim of this paper is to analyze pairwise comparisons of voices made by a speaker verification system (ALIZE/Spk) and by humans. A database of familial groups of 24 speakers was created. A single sentence was chosen for the perception test. The same sentence was used as the test signal for ALIZE/Spk, which was trained on another part of the corpus. Results show that the voice proximities within a familial group were well recovered in the speaker representation by ALIZE and much less well recovered in the representation from the perception test.
Cite as: Kahn, J., Rossato, S. (2009) Do humans and speaker verification system use the same information to differentiate voices? Proc. Interspeech 2009, 2375-2378, doi: 10.21437/Interspeech.2009-402
@inproceedings{kahn09_interspeech,
  author={Juliette Kahn and Solange Rossato},
  title={{Do humans and speaker verification system use the same information to differentiate voices?}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2375--2378},
  doi={10.21437/Interspeech.2009-402}
}