Odyssey 2012 - The Speaker and Language Recognition Workshop
We propose a novel generative approach to speaker recognition using Boltzmann machines, a fledgeling non-Gaussian probabilistic framework that is gaining increasing attention in several machine learning fields. We show how a modified i-vector representation of speech utterances enables the development of several Boltzmann machine architectures for speaker verification, and we report preliminary speaker recognition results obtained with one of them, which we refer to as Siamese twins. The Siamese twin architecture is designed to capture correlations between utterances spoken by a single speaker, and it can be regarded as a probabilistic analogue of the well-known cosine distance metric. A relative improvement of 27% is reported on NIST-2010 telephone female data.
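For orientation, the cosine distance metric that the Siamese twin architecture is compared to can be sketched as follows. This is a minimal illustration of standard cosine scoring between two utterance vectors, not the Boltzmann machine model described in the paper. The function names `cosine_score` and `same_speaker`, the toy vectors, and the decision threshold are hypothetical and chosen only for this example.

```python
import math


def cosine_score(w1, w2):
    """Cosine similarity between two utterance vectors (e.g. i-vectors).

    Returns a value in [-1, 1]; higher means the vectors point in more
    similar directions.
    """
    if len(w1) != len(w2):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(w1, w2))
    norm1 = math.sqrt(sum(a * a for a in w1))
    norm2 = math.sqrt(sum(b * b for b in w2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / (norm1 * norm2)


def same_speaker(w1, w2, threshold=0.5):
    """Accept the trial if the cosine score reaches the threshold.

    The threshold is an assumption for illustration; in practice it
    would be tuned on held-out data.
    """
    return cosine_score(w1, w2) >= threshold


# Toy 3-dimensional vectors standing in for real i-vectors.
enrol = [1.0, 2.0, 0.5]
test_close = [2.0, 4.0, 1.0]   # same direction as enrol
test_far = [-2.0, 1.0, 0.0]    # orthogonal to enrol

print(same_speaker(enrol, test_close))
print(same_speaker(enrol, test_far))
```

Cosine scoring produces a deterministic similarity value with no probabilistic interpretation; the abstract presents the Siamese twin architecture as a probabilistic counterpart to this kind of score.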
Bibliographic reference. Stafylakis, Themos / Kenny, Patrick / Senoussaoui, Mohammed / Dumouchel, Pierre (2012): "Preliminary investigation of Boltzmann machine classifiers for speaker recognition", In Odyssey-2012, 109-116.