So far, cross-language voice conversion has required at least one bilingual speaker and parallel speech data for training. This paper shows how these obstacles can be overcome by means of a recently presented text-independent training method based on unit selection. The new method is evaluated in the framework of the European speech-to-speech translation project TC-Star and achieves a performance similar to that of text-dependent intra-lingual voice conversion.
Cite as: Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Hirschberg, J. (2006) Text-independent cross-language voice conversion. Proc. Interspeech 2006, paper 1665-Thu1BuP.4, doi: 10.21437/Interspeech.2006-581
@inproceedings{sundermann06_interspeech,
  author={David Sündermann and Harald Höge and Antonio Bonafonte and Hermann Ney and Julia Hirschberg},
  title={{Text-independent cross-language voice conversion}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1665-Thu1BuP.4},
  doi={10.21437/Interspeech.2006-581}
}