12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures

Lei Li, Yoshihiko Nankaku, Keiichi Tokuda

Nagoya Institute of Technology, Japan

A spectral conversion method using multiple Gaussian Mixture Models (GMMs) based on the Bayesian framework is proposed. A typical spectral conversion framework is based on a single GMM. However, in this conventional method, the appropriate number of mixtures for the GMM depends on the amount of training data, and thus the number of mixtures must be determined beforehand. In the proposed method, the variational Bayesian approach is applied to GMM-based voice conversion, and multiple GMMs are integrated as a single statistical model. Appropriate model structures are stochastically selected for each frame based on the Bayesian framework.
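To make the idea of frame-wise model-structure selection concrete, here is a minimal sketch and not the authors' implementation. It assumes one-dimensional source and target features, joint GMMs whose parameters are already fixed (the variational Bayesian training described in the abstract is not shown), and hand-set prior weights over the model structures. The function names `convert_with_gmm` and `convert_frame` and all numeric values are hypothetical. Each GMM converts the frame with the standard conditional-mean mapping, and the per-model outputs are then combined using a per-frame posterior over the model structures.

```python
import math

def gauss(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def convert_with_gmm(x, gmm):
    """Return (p(x | gmm), E[y | x, gmm]) for one joint GMM.

    gmm is a list of components (weight, mu_x, mu_y, var_xx, cov_xy).
    """
    likes = [w * gauss(x, mx, vxx) for (w, mx, my, vxx, cxy) in gmm]
    total = sum(likes)
    y_hat = 0.0
    for like, (w, mx, my, vxx, cxy) in zip(likes, gmm):
        resp = like / total  # posterior probability of this mixture component
        y_hat += resp * (my + cxy / vxx * (x - mx))  # conditional mean of y
    return total, y_hat

def convert_frame(x, models, priors):
    """Combine several GMMs (model structures) for a single frame.

    Assumption: the structure posterior is prior * likelihood, normalized.
    """
    evidences, outputs = [], []
    for gmm, prior in zip(models, priors):
        like, y_hat = convert_with_gmm(x, gmm)
        evidences.append(prior * like)
        outputs.append(y_hat)
    norm = sum(evidences)
    post = [e / norm for e in evidences]  # per-frame posterior over structures
    y = sum(p * o for p, o in zip(post, outputs))
    return post, y

# Hypothetical toy models: one with a single mixture, one with two mixtures.
model_1 = [(1.0, 0.0, 1.0, 1.0, 0.5)]
model_2 = [(0.5, -1.0, 0.0, 0.5, 0.2), (0.5, 1.0, 2.0, 0.5, 0.2)]

post, y = convert_frame(0.3, [model_1, model_2], [0.5, 0.5])
print(post, y)
```

The sketch averages the per-model outputs with the structure posterior, which is a soft form of selection. Drawing one model index per frame from `post` (for example with `random.choices`) would be a closer analogue of the stochastic selection mentioned in the abstract.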

Bibliographic reference.  Li, Lei / Nankaku, Yoshihiko / Tokuda, Keiichi (2011): "A Bayesian approach to voice conversion based on GMMs using multiple model structures", In INTERSPEECH-2011, 661-664.