13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion

Na Li (1,2), Yu Qiao (1,3)

(1) Shenzhen key lab of CVPR, Shenzhen Institutes of Advanced Technology, CAS, China
(2) Northwestern Polytechnical University, Xi'an, China
(3) The Chinese University of Hong Kong, Hong Kong, China

The objective of voice conversion is to transform the voice of one speaker so that it sounds like that of another. The GMM-based statistical mapping technique has proven to be an efficient method for converting voices. We generalized this technique to the Mixture of Probabilistic Linear Regressions (MPLR) by using a general mixture model of the source vectors. In this paper, we improve MPLR by placing a prior on the transformation parameters of the linear regressions, which leads to the Bayesian Mixture of Probabilistic Linear Regressions (BMPLR). BMPLR inherits the effectiveness and robustness of Bayesian inference. In particular, when the amount of training data is limited and the number of mixture components is large, BMPLR can substantially alleviate the overfitting problem. This paper presents two formulations of BMPLR, which differ in how noise is modeled in the probabilistic regression function. In addition, we derive equations for MAP estimation of the transformation parameters. We evaluate the proposed method on voice conversion of Japanese utterances. The experimental results show that BMPLR achieves better performance than MPLR.
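
To make the role of the prior concrete, the following is a minimal sketch of MAP estimation for a single linear regression component. It is not the paper's formulation: it assumes scalar source and target features, a zero-mean isotropic Gaussian prior on the slope and offset, a known noise variance, and per-frame weights standing in for mixture responsibilities. The function name `map_linear_regression` and all parameter names are hypothetical.

```python
def map_linear_regression(xs, ys, weights, noise_var, prior_var):
    """MAP estimate of (a, b) in y = a*x + b + noise.

    Assumptions (illustrative only, not taken from the paper):
    - scalar features, Gaussian noise with variance noise_var
    - zero-mean Gaussian prior with variance prior_var on both a and b
    - weights[i] plays the role of a mixture responsibility for frame i
    """
    # Ratio of noise variance to prior variance acts as a ridge penalty.
    lam = noise_var / prior_var

    # Weighted sufficient statistics.
    sxx = sum(w * x * x for w, x in zip(weights, xs))
    sx = sum(w * x for w, x in zip(weights, xs))
    sw = sum(weights)
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    sy = sum(w * y for w, y in zip(weights, ys))

    # Solve the regularized 2x2 normal equations in closed form:
    # [[sxx + lam, sx], [sx, sw + lam]] @ [a, b] = [sxy, sy]
    det = (sxx + lam) * (sw + lam) - sx * sx
    a = ((sw + lam) * sxy - sx * sy) / det
    b = ((sxx + lam) * sy - sx * sxy) / det
    return a, b


# Two training frames lying exactly on y = 2x + 1.
xs, ys, ws = [0.0, 1.0], [1.0, 3.0], [1.0, 1.0]

# An infinitely broad prior reduces to the maximum-likelihood fit.
a_ml, b_ml = map_linear_regression(xs, ys, ws, 1.0, float("inf"))

# A tighter prior shrinks the slope toward zero.
a_map, b_map = map_linear_regression(xs, ys, ws, 1.0, 1.0)
```

Under these assumptions the prior behaves like a ridge penalty: with very few frames per component, it keeps the normal equations well conditioned and pulls the estimate toward the prior mean, which is the kind of overfitting relief the abstract attributes to BMPLR.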

Index Terms: Bayesian linear regression, mixture of probabilistic regressions, voice conversion

Bibliographic reference. Li, Na / Qiao, Yu (2012): "Bayesian mixture of probabilistic linear regressions for voice conversion", In INTERSPEECH-2012, 82-85.