Speech-based singing synthesis has various merits, but it also has unsolved issues. One of the most noticeable concerns segment duration and proportion in synthesised singing: syllables are short in speech but lengthened in singing, so durations taken from speech do not transfer directly. This study therefore investigates how syllables are lengthened in Mandarin singing data. A total of 20 songs from the MIREX singing corpus were segmented and analysed. The results showed that (1) the segment proportions in Mandarin syllables differ between speech and singing; (2) the lengthening is influenced more by the slots in the syllable structure than by the types of segments; (3) in the Mandarin syllable structure CGVX, the nuclear V lengthens the most, followed by X. The durations of C and G also increase, but their proportions within the syllable decrease.
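The distinction between absolute duration and proportion in finding (3) can be illustrated with a minimal sketch. The millisecond values below are hypothetical and invented purely for illustration; they are not taken from the study or from the MIREX corpus, and the helper name `proportions` is likewise an assumption.

```python
def proportions(durations_ms):
    """Return each slot's share of the total syllable duration."""
    total = sum(durations_ms.values())
    return {slot: d / total for slot, d in durations_ms.items()}


# Hypothetical CGVX slot durations in milliseconds (illustrative only,
# not measurements from the paper).
spoken = {"C": 60, "G": 40, "V": 80, "X": 20}
sung = {"C": 80, "G": 60, "V": 500, "X": 160}

spoken_share = proportions(spoken)
sung_share = proportions(sung)

for slot in spoken:
    print(
        f"{slot}: duration {spoken[slot]} -> {sung[slot]} ms, "
        f"proportion {spoken_share[slot]:.3f} -> {sung_share[slot]:.3f}"
    )
```

With these made-up numbers, every slot gets longer in singing, but because V and X grow much more than C and G, the shares of C and G within the syllable shrink even though their durations rise.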
Cite as: Zhang, C., Wang, X. (2020) Segment Duration and Proportion in Mandarin Singing. Proc. Speech Prosody 2020, 596-600, doi: 10.21437/SpeechProsody.2020-122
@inproceedings{zhang20c_speechprosody,
  author={Cong Zhang and Xinrong Wang},
  title={{Segment Duration and Proportion in Mandarin Singing}},
  year=2020,
  booktitle={Proc. Speech Prosody 2020},
  pages={596--600},
  doi={10.21437/SpeechProsody.2020-122}
}