This article presents a study of GMM-based voice conversion systems. We compare the main linear conversion functions found in the literature on an identical speech corpus. We emphasize in particular the risks of over-fitting and over-smoothing, and we propose three alternative robust conversion functions designed to minimize these risks. We show, on two experimental speech databases, that the approach suggested by Kain remains the most precise but leads to an over-fitting ratio of 1.72%. The alternatives we propose show an average degradation of 2.8% for an over-fitting ratio of 0.52%.
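For readers unfamiliar with the linear conversion functions compared here, the following is a minimal sketch of the form they commonly take in the GMM literature: the converted vector is a posterior-weighted sum of per-component linear regressions, F(x) = sum_i p(i|x) [mu_y_i + Sigma_yx_i Sigma_xx_i^-1 (x - mu_x_i)]. It is an illustration under stated assumptions, not the implementation evaluated in the paper. In particular, the diagonal covariance blocks and all names (`convert`, `posteriors`, `var_x`, `cov_yx`) are assumptions made for brevity, and the model parameters are taken as already trained.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xd - m) ** 2 / v)
        for xd, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, mu_x, var_x):
    """Posterior probability p(i | x) of each mixture component."""
    log_p = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, mu_x, var_x)
    ]
    top = max(log_p)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_p]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def convert(x, weights, mu_x, mu_y, var_x, cov_yx):
    """Map one source vector x to the target space.

    Assumption: covariance blocks are diagonal, so the term
    Sigma_yx * inv(Sigma_xx) reduces to an element-wise ratio.
    """
    post = posteriors(x, weights, mu_x, var_x)
    y = [0.0] * len(mu_y[0])
    for p, mx, my, vx, cyx in zip(post, mu_x, mu_y, var_x, cov_yx):
        for d in range(len(y)):
            y[d] += p * (my[d] + cyx[d] / vx[d] * (x[d] - mx[d]))
    return y
```

With a single component the function reduces to one linear regression. With many components, each regression has its own parameters to estimate, which gives some intuition for why the number of free parameters matters for the over-fitting risk discussed above.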
Bibliographic reference. Mesbahi, Larbi / Barreaud, Vincent / Boeffard, Olivier (2007): "Comparing GMM-based speech transformation systems", In INTERSPEECH-2007, 1989-1992.