INTERSPEECH 2014
15th Annual Conference of the International Speech Communication Association

Singapore
September 14-18, 2014

Using Deep Belief Networks for Vector-Based Speaker Recognition

W. M. Campbell

MIT Lincoln Laboratory, USA

Deep belief networks (DBNs) have become a successful approach for acoustic modeling in speech recognition. DBNs exhibit strong approximation properties, improved performance, and are parameter efficient. In this work, we propose methods for applying DBNs to speaker recognition. In contrast to prior work, our approach to DBNs for speaker recognition starts at the acoustic modeling layer. We use sparse-output DBNs trained with both unsupervised and supervised methods to generate statistics for use in standard vector-based speaker recognition methods. We show that a DBN can replace a GMM UBM in this processing. Methods, qualitative analysis, and results are given on a NIST SRE 2012 task. Overall, our results show that DBNs show competitive performance to modern approaches in an initial implementation of our framework.

Full Paper

Bibliographic reference.  Campbell, W. M. (2014): "Using deep belief networks for vector-based speaker recognition", In INTERSPEECH-2014, 676-680.