12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Gaussian Process Experts for Voice Conversion

Nicholas C. V. Pilkington, Heiga Zen, M. J. F. Gales

Toshiba Research Europe Ltd., UK

Conventional approaches to voice conversion typically use a Gaussian mixture model (GMM) to represent the joint probability density of source and target features. This model is then used to perform spectral conversion between speakers. This approach is reasonably effective but can be prone to overfitting and oversmoothing of the target spectra. This paper proposes an alternative scheme that uses a collection of Gaussian process experts to perform the spectral conversion. Gaussian processes are robust to overfitting and oversmoothing and can predict the target spectra more accurately. Experimental results indicate that the objective performance of voice conversion can be improved using the proposed approach.
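To make the idea of converting features with a collection of Gaussian process experts more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a zero-mean GP with a squared-exponential kernel, fixed hyperparameters, and a hypothetical partitioning rule (sort the training frames by the first source dimension, split them into equal chunks, and route each test frame to the expert with the nearest centre). The abstract does not specify any of these choices, and all names (`GPExpert`, `fit_experts`, `convert`) are illustrative.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class GPExpert:
    """Zero-mean GP regressor fitted on one subset of the training frames."""

    def __init__(self, x, y, noise=1e-2):
        self.x = x
        self.center = x.mean(axis=0)  # used only to route test frames (assumption)
        k = rbf_kernel(x, x) + noise * np.eye(len(x))
        chol = np.linalg.cholesky(k)
        # alpha = K^{-1} y, computed with two solves against the Cholesky factor
        self.alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))

    def predict(self, x_new):
        """GP predictive mean k(x_new, X) K^{-1} y."""
        return rbf_kernel(x_new, self.x) @ self.alpha


def fit_experts(x, y, n_experts=2):
    """Hypothetical partitioning: equal chunks along the first source dimension."""
    order = np.argsort(x[:, 0])
    return [GPExpert(x[idx], y[idx]) for idx in np.array_split(order, n_experts)]


def convert(experts, x_new):
    """Predict each test frame with the expert whose centre is nearest."""
    centers = np.stack([e.center for e in experts])
    dists = ((x_new[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argmin(dists, axis=1)
    out = np.empty((len(x_new), experts[0].alpha.shape[1]))
    for i, expert in enumerate(experts):
        mask = nearest == i
        if mask.any():
            out[mask] = expert.predict(x_new[mask])
    return out


# Toy data standing in for paired source/target spectral features.
rng = np.random.default_rng(0)
source = rng.uniform(-3.0, 3.0, size=(80, 1))
target = np.sin(source) + 0.05 * rng.normal(size=(80, 1))

experts = fit_experts(source, target, n_experts=2)
test_frames = np.linspace(-2.5, 2.5, 20)[:, None]
converted = convert(experts, test_frames)
rmse = float(np.sqrt(np.mean((converted - np.sin(test_frames)) ** 2)))
print("RMSE on toy data:", rmse)
```

Splitting the data among several experts keeps each kernel matrix small, which matters because exact GP inference scales cubically with the number of training frames. A real system would also need time-aligned source and target frames and tuned hyperparameters, neither of which this sketch covers.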


Bibliographic reference. Pilkington, Nicholas C. V. / Zen, Heiga / Gales, M. J. F. (2011): "Gaussian process experts for voice conversion", In INTERSPEECH-2011, 2761-2764.