In this paper, we propose an effective method for predicting initial/final (I/F) durations in corpus-based singing voice synthesis of Mandarin Chinese. The goal of the method is to improve the naturalness and clarity of the synthesized singing voice. To this end, we construct a separate I/F duration prediction model for each category of consonants, and each model uses a support vector machine for duration prediction. To improve accuracy, both linguistic/phonetic attributes and music-score information are used as input features for the I/F duration prediction models. Experimental results demonstrate that the proposed method is effective in predicting I/F durations for singing voice synthesis.
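The per-category design described above can be illustrated with a minimal sketch: training samples are grouped by the consonant category of the syllable's initial, one regressor is fitted per group, and prediction dispatches on the category. Everything concrete here is an assumption made for illustration, not the authors' implementation: the category map, the `make_features` helper, the feature choice (tone, note length, pitch), and the small linear epsilon-insensitive regressor trained by subgradient descent, which stands in for whatever SVM formulation, kernel, and toolkit the paper actually used.

```python
from collections import defaultdict

# Hypothetical mapping from Mandarin initials to consonant categories;
# the abstract does not list the actual categories.
CATEGORY_OF_INITIAL = {
    "b": "stop", "d": "stop", "g": "stop",
    "m": "nasal", "n": "nasal",
    "s": "fricative", "sh": "fricative",
}


def make_features(tone, note_seconds, midi_pitch):
    """Hypothetical feature vector: one linguistic attribute (tone)
    plus two music-score attributes (note length and pitch), roughly scaled."""
    return [tone / 5.0, note_seconds, midi_pitch / 127.0]


class LinearSVR:
    """Linear regressor with an epsilon-insensitive loss and L2 penalty,
    fitted by plain subgradient descent (a simple stand-in for an SVM)."""

    def __init__(self, epsilon=0.01, lam=1e-4, lr=0.01, epochs=200):
        self.epsilon = epsilon  # half-width of the no-penalty tube
        self.lam = lam          # L2 regularization strength on the weights
        self.lr = lr            # constant step size
        self.epochs = epochs
        self.w = []
        self.b = 0.0

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def fit(self, X, y):
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for x, target in zip(X, y):
                err = self.predict(x) - target
                # Subgradient of the loss: zero inside the tube, sign(err) outside.
                if abs(err) <= self.epsilon:
                    g = 0.0
                else:
                    g = 1.0 if err > 0 else -1.0
                self.w = [wi - self.lr * (g * xi + self.lam * wi)
                          for wi, xi in zip(self.w, x)]
                self.b -= self.lr * g
        return self


class CategoryDurationPredictor:
    """Keeps one duration model per consonant category."""

    def __init__(self, category_of_initial, make_model=LinearSVR):
        self.category_of_initial = category_of_initial
        self.make_model = make_model
        self.models = {}

    def fit(self, samples):
        """samples: iterable of (initial, feature_vector, duration_in_seconds)."""
        grouped = defaultdict(lambda: ([], []))
        for initial, features, duration in samples:
            X, y = grouped[self.category_of_initial[initial]]
            X.append(features)
            y.append(duration)
        # Train a separate model on each category's samples only.
        for category, (X, y) in grouped.items():
            self.models[category] = self.make_model().fit(X, y)
        return self

    def predict(self, initial, features):
        category = self.category_of_initial[initial]
        return self.models[category].predict(features)
```

A predictor built this way would be trained with `CategoryDurationPredictor(CATEGORY_OF_INITIAL).fit(samples)` and queried with `predict(initial, make_features(...))`. Passing a different `make_model` factory is one way to swap in a kernel SVM from an established library.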
Bibliographic reference. Lin, Cheng-Yuan / Jao, Pei-Chi / Jang, J.-S. Roger (2007): "An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese", in INTERSPEECH-2007, 470-473.