Interspeech'2005 - Eurospeech
In this paper, we investigate the use of a distance between Gaussian mixture models for speaker detection. The proposed distance is derived from the Kullback-Leibler (KL) divergence and is defined as a Euclidean distance in a particular model space. Because this distance can be computed directly from the model parameters, the scoring process is very efficient. This new scoring framework is compared with the classical log-likelihood ratio approach on a speaker verification task from the NIST 2004 evaluation and on the speaker tracking task of the French ESTER evaluation. Results show that the proposed approach is competitive and reduces computation time by a factor of more than 3.
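To illustrate what a Euclidean distance computed directly from model parameters could look like, here is a minimal sketch. The abstract does not give the formula, so everything below is an assumption: both models are taken to share the same mixture weights and diagonal covariances (as would happen if only the means were adapted from a common background model), and each model is mapped to a vector of weight- and variance-normalized means. The function names `to_supervector` and `model_space_distance` are hypothetical and not taken from the paper.

```python
import math


def to_supervector(weights, means, variances):
    """Map a GMM to one flat vector with entries sqrt(w_i) * mu_i / sigma_i.

    ASSUMPTION: diagonal covariances, given as per-dimension variances.
    This mapping is illustrative; the abstract does not specify it.
    """
    vec = []
    for w, mu, var in zip(weights, means, variances):
        for m, v in zip(mu, var):
            vec.append(math.sqrt(w) * m / math.sqrt(v))
    return vec


def model_space_distance(weights, means_a, means_b, variances):
    """Euclidean distance between two GMMs in the normalized-mean space.

    ASSUMPTION: both models share `weights` and `variances` and differ
    only in their component means.
    """
    a = to_supervector(weights, means_a, variances)
    b = to_supervector(weights, means_b, variances)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy example: two 2-component, 2-dimensional models.
weights = [0.5, 0.5]
variances = [[1.0, 1.0], [1.0, 1.0]]
speaker_means = [[0.0, 0.0], [1.0, 1.0]]
segment_means = [[1.0, 0.0], [1.0, 1.0]]
print(model_space_distance(weights, speaker_means, segment_means, variances))
```

Under these assumptions the score needs only one pass over the model parameters and no evaluation of frame likelihoods, which is consistent with the efficiency argument in the abstract, although the actual speed-up reported there comes from the authors' own system.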
Bibliographic reference. Ben, Mathieu / Gravier, Guillaume / Bimbot, Frédéric (2005): "A model space framework for efficient speaker detection", In INTERSPEECH-2005, 3061-3064.