9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speaker Recognition Based on Variational Bayesian Method

Tatsuya Ito, Kei Hashimoto, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Nagoya Institute of Technology, Japan

This paper presents a speaker identification system based on Gaussian Mixture Models (GMMs) trained with the variational Bayesian (VB) method. Maximum Likelihood (ML) and Maximum A Posteriori (MAP) estimation are well-known methods for estimating GMM parameters. However, because both rely on a point estimate of the model parameters, they suffer from overtraining when the training data are insufficient. The Bayesian approach instead estimates a posterior distribution over the model parameters and achieves more robust prediction than the ML and MAP approaches. To handle the complicated integral calculations that the Bayesian approach requires, the variational Bayesian method has been proposed and applied to many classification problems using latent variable models. However, the performance of the Bayesian approach has not been extensively investigated in large speaker identification tasks. The experimental results show that the VB method mitigates the overtraining problem more effectively than the conventional ML and MAP methods.
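
The contrast drawn above between point estimation and Bayesian prediction can be written out as a short sketch. The notation is an assumption introduced here for illustration and is not taken from the paper: X is the training data for one speaker, Z the latent mixture assignments, \theta the GMM parameters, and x a test frame. The factorized form q(Z) q(\theta) is the standard variational Bayesian assumption for latent variable models; the paper's exact formulation may differ.

```latex
% Notation (assumed): X = training data, Z = latent mixture assignments,
% \theta = GMM parameters, x = test frame.

% ML and MAP: a single point estimate is plugged into the likelihood.
\[
  \hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta}\, p(X \mid \theta),
  \qquad
  \hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(X \mid \theta)\, p(\theta)
\]

% Bayesian prediction: the parameters are integrated out under the posterior.
\[
  p(x \mid X) = \int p(x \mid \theta)\, p(\theta \mid X)\, d\theta
\]

% Variational Bayes: the intractable posterior p(Z, \theta \mid X) is
% approximated by q(Z)\, q(\theta), chosen to maximize a lower bound
% on the log marginal likelihood.
\[
  \log p(X) \;\ge\; \mathcal{F}[q]
  = \mathbb{E}_{q(Z)\, q(\theta)}\!\left[
      \log \frac{p(X, Z \mid \theta)\, p(\theta)}{q(Z)\, q(\theta)}
    \right]
\]
```

With little training data, the posterior p(\theta \mid X) stays broad, so the integrated prediction is smoother than a plug-in estimate. That is the sense in which the Bayesian approach is expected to be less prone to overtraining.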

Bibliographic reference. Ito, Tatsuya / Hashimoto, Kei / Nankaku, Yoshihiko / Lee, Akinobu / Tokuda, Keiichi (2008): "Speaker recognition based on variational Bayesian method", in INTERSPEECH-2008, 1417-1420.