INTERSPEECH 2004 - ICSLP
This paper presents preliminary work towards integrating the Fujisaki model into the VnVoice Vietnamese TTS system, using a set of rules to control the F0 contour. A speech corpus was compiled from 20 sentences, each of which can take on different meanings depending on the tone of a monosyllabic keyword it contains. The corpus, 46 sentences in total, was recorded by a female speaker whose voice had also been used for the VnVoice speech corpus, and was labeled at the syllable level. Results from a tone contrast perception test and a naturalness comparison show that the Fujisaki model models the F0 contours of Vietnamese tones well.
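For readers unfamiliar with the Fujisaki model, the following is a minimal sketch of its standard formulation, not the rule set or implementation used in VnVoice. In the log domain, the F0 contour is a base value plus the responses to phrase commands (impulses) and to accent or tone commands (rectangular steps). The function names, the constants `alpha`, `beta` and `gamma`, and all command timings and amplitudes below are illustrative assumptions and are not taken from the paper. Negative tone command amplitudes are included only as an assumption about how falling or low tones might be represented.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase component: alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def tone_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent/tone component, limited by the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(times, fb, phrase_cmds, tone_cmds, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at each time in `times` (seconds).

    fb          -- base frequency in Hz
    phrase_cmds -- list of (onset_time, amplitude) impulses
    tone_cmds   -- list of (onset_time, offset_time, amplitude) steps
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        for t1, t2, aa in tone_cmds:
            # A rectangular command is the difference of two step responses.
            log_f0 += aa * (tone_response(t - t1, beta, gamma)
                            - tone_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical two-syllable example: one phrase command, one positive and
# one negative tone command (all values are made up for illustration).
times = [i * 0.01 for i in range(200)]
f0 = fujisaki_f0(
    times,
    fb=180.0,
    phrase_cmds=[(0.0, 0.5)],
    tone_cmds=[(0.3, 0.6, 0.4), (0.9, 1.2, -0.3)],
)
```

In a rule-based setting of the kind the abstract describes, the rules would presumably choose the command timings and amplitudes per syllable and tone; the sketch only evaluates the contour once those values are given.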
Bibliographic reference. Nguyen, Dung Tien / Luong, Mai Chi / Vu, Bang Kim / Mixdorff, Hansjoerg / Ngo, Huy Hoang (2004): "Fujisaki model based F0 contours in Vietnamese TTS", in INTERSPEECH-2004, 1429-1432.