11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Incorporating MAP Estimation and Covariance Transform for SVM Based Speaker Recognition

Cheung-Chi Leung, Donglai Zhu, Kong Aik Lee, Bin Ma, Haizhou Li

A*STAR, Singapore

In this paper, we apply a Constrained Maximum a Posteriori Linear Regression (CMAPLR) transformation to a Universal Background Model (UBM) to characterize each speaker with a supervector. In addition to the mean transformation parameters, we incorporate covariance transformation parameters into the supervector, adopting the Maximum Likelihood Linear Regression (MLLR) covariance transformation. We also present the auxiliary function maximization involved in Maximum Likelihood (ML) and Maximum a Posteriori (MAP) estimation. Our experiments on the 2006 NIST Speaker Recognition Evaluation (SRE) corpus show that the two proposed techniques, MAP estimation of the transform and the inclusion of covariance transformation parameters, provide substantial performance improvement.
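
As a rough illustration of the supervector idea described above, the sketch below stacks the parameters of a mean transform and a covariance transform into one vector that could be passed to an SVM. The abstract does not specify the layout, so everything here is an assumption: the function name `build_supervector`, the symbols `A`, `b` (mean transform) and `H` (covariance transform), the use of a single global transform, and the row-major flattening order are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def build_supervector(A, b, H):
    """Stack mean-transform (A, b) and covariance-transform (H) parameters.

    Hypothetical layout: the d x (d+1) matrix [A | b] flattened row by row,
    followed by the d x d matrix H flattened row by row.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    H = np.asarray(H, dtype=float)
    d = A.shape[0]
    if A.shape != (d, d) or b.shape != (d,) or H.shape != (d, d):
        raise ValueError("expected A: (d, d), b: (d,), H: (d, d)")
    W = np.hstack([A, b[:, None]])  # extended mean transform, d x (d+1)
    return np.concatenate([W.ravel(), H.ravel()])


# Identity transforms for a 3-dimensional feature space (toy values only).
d = 3
sv = build_supervector(np.eye(d), np.zeros(d), np.eye(d))
print(sv.shape)  # (21,) = d*(d+1) + d*d
```

With several regression classes, one such vector per class would presumably be concatenated, but the abstract does not say how many transforms are used.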

Bibliographic reference. Leung, Cheung-Chi / Zhu, Donglai / Lee, Kong Aik / Ma, Bin / Li, Haizhou (2010): "Incorporating MAP estimation and covariance transform for SVM based speaker recognition", in INTERSPEECH-2010, 2318-2321.