ISCA Archive VCCBC 2020

Submission from SRCB for Voice Conversion Challenge 2020

Qiuyue Ma, Ruolan Liu, Xue Wen, Chunhui Lu, Xiao Chen

This paper presents our intra-lingual and cross-lingual voice conversion system for the Voice Conversion Challenge 2020 (VCC 2020). Voice conversion (VC) modifies a source speaker’s speech so that the result sounds like a target speaker. This becomes particularly difficult when the source and target speakers speak different languages. In this work we focus on building a voice conversion system that achieves consistent improvements in accent and intelligibility evaluations. Our system consists of a speech representation module based on bilingual phoneme recognition, a neural-network-based speech generation module, and a neural vocoder. More concretely, we extract general phonation from the source speakers' speech in different languages, and we improve sound quality by optimizing the speech synthesis module and adding a noise-suppression post-processing module to the vocoder. This framework produces highly intelligible and highly natural speech that is very close to human quality (MOS = 4.17, rank 2 in Task 1; MOS = 4.13, rank 2 in Task 2).


doi: 10.21437/VCCBC.2020-18

Cite as: Ma, Q., Liu, R., Wen, X., Lu, C., Chen, X. (2020) Submission from SRCB for Voice Conversion Challenge 2020. Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 131-135, doi: 10.21437/VCCBC.2020-18

@inproceedings{ma20_vccbc,
  author={Qiuyue Ma and Ruolan Liu and Xue Wen and Chunhui Lu and Xiao Chen},
  title={{Submission from SRCB for Voice Conversion Challenge 2020}},
  year=2020,
  booktitle={Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020},
  pages={131--135},
  doi={10.21437/VCCBC.2020-18}
}