The OPPO System for the Blizzard Challenge 2020

Yang Song, Min Liang, Guilin Yang, Kun Xie, Jie Hao


This paper presents the OPPO text-to-speech system for the Blizzard Challenge 2020. We built a system based on statistical parametric speech synthesis, with improvements to both the frontend and the backend. For the Mandarin task, the frontend used a BERT model, and the backend used a Tacotron acoustic model with a WaveRNN vocoder. For the Shanghainese task, the frontend was built from scratch, and the backend used a Tacotron acoustic model with a MelGAN vocoder. Evaluation results showed that, for the Mandarin task, our system performed best in naturalness and achieved near-best results in similarity. For the Shanghainese task, our system scored poorly on most evaluation measures.


DOI: 10.21437/VCC_BC.2020-3

Cite as: Song, Y., Liang, M., Yang, G., Xie, K., Hao, J. (2020) The OPPO System for the Blizzard Challenge 2020. Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 24-27, DOI: 10.21437/VCC_BC.2020-3.


@inproceedings{Song2020,
  author={Yang Song and Min Liang and Guilin Yang and Kun Xie and Jie Hao},
  title={{The OPPO System for the Blizzard Challenge 2020}},
  year=2020,
  booktitle={Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020},
  pages={24--27},
  doi={10.21437/VCC_BC.2020-3},
  url={http://dx.doi.org/10.21437/VCC_BC.2020-3}
}