11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Bayesian Speaker Recognition Using Gaussian Mixture Model and Laplace Approximation

Shih-Sian Cheng, I-Fan Chen, Hsin-Min Wang

Academia Sinica, Taiwan

This paper presents a Bayesian approach to Gaussian mixture model (GMM)-based speaker identification. Conventional methods score a test utterance with a single data likelihood under a GMM obtained by point estimation, using either the maximum likelihood or the maximum a posteriori criterion. The Bayesian approach instead scores the utterance with the expectation of the data likelihood over the posterior distribution of the model parameters, which is expressed as a Bayesian integral. Because this integral cannot be derived analytically, we apply the Laplace approximation. We show theoretically that the proposed Bayesian approach is equivalent to the GMM-UBM approach when infinite training data is available for each speaker. Speaker identification experiments on the TIMIT corpus show that the proposed Bayesian approach consistently outperforms GMM-UBM under very limited training data conditions.
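The difference between point-estimate scoring and Bayesian scoring can be illustrated on a toy problem. The sketch below is not the paper's GMM derivation: it assumes a single 1-D Gaussian with known variance whose unknown mean has a Gaussian posterior, and all function and parameter names are hypothetical. In this simplified setting the log integrand is quadratic in the mean, so the Laplace approximation happens to be exact and can be checked against brute-force numerical integration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_joint(mu, xs, lik_var, post_mean, post_var):
    """log[ p(xs | mu) * p(mu | training data) ] for the toy model (assumed form)."""
    data_term = sum(log_gauss(x, mu, lik_var) for x in xs)
    return data_term + log_gauss(mu, post_mean, post_var)


def plugin_log_score(xs, lik_var, post_mean):
    """Point-estimate score: likelihood at a single parameter value."""
    return sum(log_gauss(x, post_mean, lik_var) for x in xs)


def laplace_log_score(xs, lik_var, post_mean, post_var):
    """Laplace approximation of log of the integral of exp(log_joint) over mu.

    Expands log_joint to second order around its mode, which gives
    log_joint(mode) + 0.5 * log(2 * pi / precision).
    """
    # Negative second derivative of log_joint with respect to mu.
    precision = len(xs) / lik_var + 1.0 / post_var
    # Mode of log_joint, available in closed form for this toy model.
    mode = (sum(xs) / lik_var + post_mean / post_var) / precision
    peak = log_joint(mode, xs, lik_var, post_mean, post_var)
    return peak + 0.5 * math.log(2.0 * math.pi / precision)


def numeric_log_score(xs, lik_var, post_mean, post_var,
                      lo=-10.0, hi=10.0, steps=20000):
    """Reference value: trapezoidal integration of exp(log_joint) over mu."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        mu = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(log_joint(mu, xs, lik_var, post_mean, post_var))
    return math.log(total * width)


if __name__ == "__main__":
    test_xs = [0.3, -0.1, 0.8]  # made-up "test utterance" samples
    print("plug-in :", plugin_log_score(test_xs, 1.0, 0.5))
    print("Laplace :", laplace_log_score(test_xs, 1.0, 0.5, 0.25))
    print("numeric :", numeric_log_score(test_xs, 1.0, 0.5, 0.25))
```

As the posterior variance shrinks toward zero, which is what happens to this toy posterior as training data accumulates, the Bayesian score in this sketch approaches the plug-in score. This mirrors, in a much simpler model, the limiting behaviour that the abstract states for the proposed approach and GMM-UBM.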

Bibliographic reference. Cheng, Shih-Sian / Chen, I-Fan / Wang, Hsin-Min (2010): "Bayesian speaker recognition using Gaussian mixture model and Laplace approximation", In INTERSPEECH-2010, 2730-2733.