ISCA Archive Interspeech 2007

Frame alignment method for cross-lingual voice conversion

Daniel Erro, Asunción Moreno

Most existing voice conversion methods calculate the optimal transformation function from a given set of paired acoustic vectors of the source and target speakers. Aligning phonetically equivalent source and target frames is problematic when the available training corpus is not parallel, even though this is the most realistic situation. The alignment task is even more difficult in cross-lingual applications because the phoneme sets of the languages involved may differ. In this paper, a new iterative alignment method based on acoustic distances is proposed. The method is shown to be suitable for text-independent and cross-lingual voice conversion, and the conversion scores obtained in our evaluation experiments are not far from the performance achieved by using parallel training corpora.


doi: 10.21437/Interspeech.2007-551

Cite as: Erro, D., Moreno, A. (2007) Frame alignment method for cross-lingual voice conversion. Proc. Interspeech 2007, 1969-1972, doi: 10.21437/Interspeech.2007-551

@inproceedings{erro07b_interspeech,
  author={Daniel Erro and Asunción Moreno},
  title={{Frame alignment method for cross-lingual voice conversion}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1969--1972},
  doi={10.21437/Interspeech.2007-551}
}