This paper studies how voice conversion systems based on Gaussian mixture models (GMMs) behave as the size of the training corpus is reduced. Our first objective is to locate the degradation threshold, that is, the point in the reduction of the training corpus from which the conversion error becomes too large. Our second objective is to observe how these conversion systems behave relative to this threshold, in order to relate the size of the training corpus to the complexity of each transformation method. We observed that the threshold lies beyond 50 sentences (ARCTIC corpus), whatever the conversion system. For this corpus, the conversion error of the best approach increases by only 1.77% compared to the complete training corpus, which contains 210 utterances.
Cite as: Mesbahi, L., Barreaud, V., Boeffard, O. (2007) GMM-based speech transformation systems under data reduction. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 119-124
@inproceedings{mesbahi07_ssw,
  author={Larbi Mesbahi and Vincent Barreaud and Olivier Boeffard},
  title={{GMM-based speech transformation systems under data reduction}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={119--124}
}