In this paper, we propose a new score normalization method for text-independent speaker verification using Gaussian mixture models (GMMs). In the proposed method, the cohort model is designed as a virtual speaker model based on the similarity of local acoustic information between the reference speaker and other customers. The similarity is determined using a statistical distance between model components, such as the Gaussian distributions. The synthesized cohort model is therefore statistically close to the reference speaker model and can provide an effective normalizing score for a wide range of observed measurements. Experimental results using telephone speech from 60 speakers showed that the proposed method outperforms the typical methods that use a cohort speaker model or a pooled model. With a cohort size of three and a common threshold determined a posteriori for all speakers, the Equal Error Rate (EER) was reduced from 3.82 % (conventional normalization with a cohort speaker model) or 10.3 % (normalization with a pooled model) to 2.50 % (proposed method).
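The idea of a virtual cohort model can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only says "statistical distance", so the symmetric Kullback-Leibler divergence used here is an assumption, as are the diagonal covariances, the rule of picking the `n_closest` pooled components for each reference component, the reuse of the reference mixture weights, and all function names (`sym_kl_diag`, `build_virtual_cohort`, `normalized_score`), which are hypothetical. In particular, `n_closest` is not claimed to correspond to the cohort size reported above.

```python
import numpy as np

def sym_kl_diag(mu1, var1, mu2, var2):
    """Symmetric KL divergence between two diagonal-covariance Gaussians
    (assumed distance; the abstract does not name one)."""
    d2 = (mu1 - mu2) ** 2
    kl12 = 0.5 * np.sum(var1 / var2 + d2 / var2 - 1.0 + np.log(var2 / var1))
    kl21 = 0.5 * np.sum(var2 / var1 + d2 / var1 - 1.0 + np.log(var1 / var2))
    return kl12 + kl21

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of frames X (T, D) under a diagonal GMM."""
    diff = X[:, None, :] - means[None, :, :]                     # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_mix = np.log(weights) + log_comp                         # (T, M)
    peak = log_mix.max(axis=1, keepdims=True)                    # log-sum-exp trick
    frame_ll = peak[:, 0] + np.log(np.sum(np.exp(log_mix - peak), axis=1))
    return float(np.mean(frame_ll))

def build_virtual_cohort(ref, others, n_closest=1):
    """For each reference component, copy the n_closest components found in the
    other speakers' models (assumed selection rule). Each model is a tuple
    (weights, means, variances)."""
    w_ref, mu_ref, var_ref = ref
    pool_mu = np.concatenate([mu for _, mu, _ in others])
    pool_var = np.concatenate([var for _, _, var in others])
    if n_closest > len(pool_mu):
        raise ValueError("n_closest exceeds the number of pooled components")
    w, mu, var = [], [], []
    for wi, mi, vi in zip(w_ref, mu_ref, var_ref):
        dists = np.array([sym_kl_diag(mi, vi, pm, pv)
                          for pm, pv in zip(pool_mu, pool_var)])
        idx = np.argsort(dists)[:n_closest]
        w.extend([wi / n_closest] * n_closest)   # split the reference weight evenly
        mu.extend(pool_mu[idx])
        var.extend(pool_var[idx])
    return np.array(w), np.array(mu), np.array(var)

def normalized_score(X, ref, cohort):
    """Log-likelihood ratio of the reference model against the cohort model."""
    return gmm_avg_loglik(X, *ref) - gmm_avg_loglik(X, *cohort)
```

A claimed identity would then be accepted when `normalized_score(X, ref, build_virtual_cohort(ref, others))` exceeds a threshold shared by all speakers, which is the setting in which the EER figures above are reported.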
Cite as: Isobe, T., Takahashi, J.-i. (1999) Text-independent speaker verification using virtual speaker based cohort normalization. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 987-990, doi: 10.21437/Eurospeech.1999-241
@inproceedings{isobe99_eurospeech,
  author={Toshihiro Isobe and Jun-ichi Takahashi},
  title={{Text-independent speaker verification using virtual speaker based cohort normalization}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={987--990},
  doi={10.21437/Eurospeech.1999-241}
}