Conventional approaches to voice conversion typically use a Gaussian mixture model (GMM) to represent the joint probability density of source and target features. This model is then used to perform spectral conversion between speakers. The approach is reasonably effective but is prone to overfitting and to oversmoothing of the target spectra. This paper proposes an alternative scheme that uses a collection of Gaussian process experts to perform the spectral conversion. Gaussian processes are robust to overfitting and oversmoothing and can predict the target spectra more accurately. Experimental results indicate that the proposed approach improves the objective performance of voice conversion.
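As a rough illustration of the regression step underlying a Gaussian process predictor, the following minimal sketch fits a single zero-mean Gaussian process to toy one-dimensional data and predicts a target value from a source value. It is not the configuration used in the paper: the squared-exponential kernel, the hyperparameter values, the synthetic sine data, and the helper names `rbf_kernel` and `gp_predict` are all assumptions made for illustration, and the partitioning of the data among multiple experts is not shown.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale ** 2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, **kernel_args):
    """Posterior mean and latent variance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train, **kernel_args) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train, **kernel_args)
    # Cholesky factorisation gives a numerically stable solve of k_tt^{-1} y.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(x_test, x_test, **kernel_args).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var


# Toy stand-in for paired source/target features (assumed, not real speech data).
x_train = np.linspace(0.0, 5.0, 30)[:, None]   # "source" feature
y_train = np.sin(x_train[:, 0])                # "target" feature
x_test = np.array([[1.0], [2.5]])

mean, var = gp_predict(x_train, y_train, x_test, noise=1e-2, length_scale=1.0)
print("predicted targets:", mean)
print("predictive variances:", var)
```

The predictive variance returned alongside the mean is what distinguishes this kind of model from a point estimate: it indicates how strongly the training data constrain each prediction.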
Bibliographic reference. Pilkington, Nicholas C. V. / Zen, Heiga / Gales, M. J. F. (2011): "Gaussian process experts for voice conversion", In INTERSPEECH-2011, 2761-2764.