13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures

Takafumi Hattori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda

Department of Computer Science and Engineering, Nagoya Institute of Technology, Nagoya, Japan

This paper proposes a speaker recognition technique that uses multiple model structures based on the Bayesian approach. Many sophisticated statistical models have recently been proposed for speaker recognition, e.g., Joint Factor Analysis and i-vector based methods. However, since most of them are based on Gaussian Mixture Models (GMMs), improving the estimation accuracy of the generative models, i.e., GMMs, with a limited amount of training data remains an important problem in speaker recognition. For this purpose, a Bayesian approach that marginalizes over all possible model parameters has been applied to GMM-based speaker recognition. This paper extends it to marginalization over model structures. The proposed method can improve the estimation accuracy by integrating multiple GMMs with different numbers of mixtures within the Bayesian framework. Experimental results show that the proposed method improves identification rates over the conventional method using a single model structure.
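The idea of integrating GMMs with different numbers of mixtures can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a log score log p(X | speaker, structure) has already been computed for each candidate number of mixtures (for example, a Bayesian predictive likelihood), and it assumes a uniform prior over structures unless one is supplied. The function names, the speaker labels, and the toy numbers are all hypothetical and serve only to show the arithmetic of summing over structures in the log domain.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_log_score(log_scores, log_priors=None):
    """Marginalize per-structure log scores over model structures.

    log_scores: {num_mixtures: log p(X | speaker, structure)} (assumed given)
    log_priors: {num_mixtures: log p(structure)}; uniform if omitted (assumption)
    """
    if log_priors is None:
        uniform = -math.log(len(log_scores))
        log_priors = {m: uniform for m in log_scores}
    return logsumexp([log_scores[m] + log_priors[m] for m in log_scores])


def identify(speaker_scores):
    """Return the speaker whose structure-marginalized score is highest."""
    return max(speaker_scores, key=lambda s: marginal_log_score(speaker_scores[s]))


# Toy, made-up log scores for two hypothetical speakers and three structures
# (8, 16, and 32 mixtures); these are not results from the paper.
scores = {
    "spk_a": {8: -1210.0, 16: -1195.0, 32: -1201.0},
    "spk_b": {8: -1198.0, 16: -1199.0, 32: -1230.0},
}
best = identify(scores)
print(best)
```

A single-structure system would instead pick one number of mixtures in advance and compare only that column of scores. Summing over structures lets each speaker's score draw on whichever structures explain the data well.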

Index Terms: speaker recognition, GMM, Bayesian approach, model structure


Bibliographic reference.  Hattori, Takafumi / Hashimoto, Kei / Nankaku, Yoshihiko / Tokuda, Keiichi (2012): "A Bayesian approach to speaker recognition based on GMMs using multiple model structures", In INTERSPEECH-2012, 1107-1110.