NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion

Chitralekha Gupta, Karthika Vijayan, Bidisha Sharma, Xiaoxue Gao, Haizhou Li


Singing like a professional is widely appealing, yet most people cannot match a singer who has had years of formal training. We develop a web platform on which users can perform personalized singing synthesis. A user reads and records the lyrics of a song on the platform and receives good-quality singing vocals synthesized in their own voice. At the backend of the web interface, we perform template-based speech-to-singing voice conversion, which takes the prosody characteristics of the song from good-quality singing by a trained singer and retains the speaker characteristics of the respective user. We use an improved temporal alignment scheme between speech and singing signals based on tandem features, and employ a deep-spectral map to incorporate singing spectral characteristics into the user’s voice. The singing vocals are then synthesized by a vocoder. With this web platform, we advocate that ‘everyone can sing as they desire’.
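To make the temporal alignment step more concrete, the following is a minimal sketch of plain dynamic time warping (DTW) between a speech feature sequence and a singing feature sequence. DTW is used here only as a common, generic alignment technique. The abstract does not specify the alignment algorithm or how the tandem features are computed, so the function name `dtw_path`, the Euclidean frame distance, and the toy one-dimensional features are all assumptions made for illustration and not the authors' implementation.

```python
import numpy as np


def dtw_path(speech_feats, singing_feats):
    """Align two feature sequences (frames x dims) with plain DTW.

    Hypothetical helper for illustration; not the paper's alignment scheme.
    Returns the warping path as (speech_frame, singing_frame) index pairs
    and the total accumulated cost.
    """
    speech_feats = np.asarray(speech_feats, dtype=float)
    singing_feats = np.asarray(singing_feats, dtype=float)
    n, m = len(speech_feats), len(singing_feats)

    # Euclidean distance between every speech frame and every singing frame.
    dist = np.linalg.norm(
        speech_feats[:, None, :] - singing_feats[None, :, :], axis=-1
    )

    # Accumulated cost with a one-cell border of infinities.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1],  # advance in both sequences
                cost[i - 1, j],      # advance in speech only
                cost[i, j - 1],      # advance in singing only
            )

    # Backtrack from the last frame pair to the first.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return path, float(cost[n, m])


# Toy example: the "singing" holds each value twice as long as the "speech".
speech = [[0.0], [1.0], [2.0]]
singing = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]
path, total = dtw_path(speech, singing)
print(path)   # [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4), (2, 5)]
print(total)  # 0.0
```

In the toy example each speech frame is mapped to two singing frames, which mirrors how spoken syllables are stretched to the longer note durations of a sung template. A real system would replace the toy features with frame-level acoustic features and would likely use a more constrained or feature-aware alignment than this baseline.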


Cite as: Gupta, C., Vijayan, K., Sharma, B., Gao, X., Li, H. (2019) NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion. Proc. Interspeech 2019, 2376-2377.


@inproceedings{Gupta2019,
  author={Chitralekha Gupta and Karthika Vijayan and Bidisha Sharma and Xiaoxue Gao and Haizhou Li},
  title={{NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2376--2377}
}