8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Fujisaki Model based F0 contours in Vietnamese TTS

Dung Tien Nguyen (1), Mai Chi Luong (2), Bang Kim Vu (2), Hansjoerg Mixdorff (3), Huy Hoang Ngo (2)

(1) Vietnam National University, Viet Nam
(2) Vietnamese Academy of Science and Technology, Viet Nam
(3) Berlin University of Applied Sciences, Germany

This paper presents preliminary work towards integrating the Fujisaki model into the VnVoice Vietnamese TTS system, using a set of rules to control the F0 contour. A speech corpus consisting of 20 sentences was compiled. Each sentence can take on various meanings depending on the tone associated with a monosyllabic keyword that it contains. The corpus, with a total of 46 sentences, was recorded by a female speaker whose voice had also been used in the speech corpus for VnVoice, and was labeled at the syllable level. Tone contrast perception results and naturalness comparisons show that the Fujisaki model works well for modeling the F0 contours of Vietnamese tones.
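For readers unfamiliar with the Fujisaki model, the following is a minimal Python sketch of its standard textbook formulation, in which log F0 is the sum of a base value, phrase components and accent (tone) components. It is not the VnVoice implementation or the rule set described in the paper. The function names, the default constants (alpha, beta, gamma), the base frequency and all command timings and amplitudes are illustrative assumptions, as is the use of a negative amplitude for a falling tone.

```python
import math


def phrase_response(t, alpha=2.0):
    """Gp(t): impulse response of the phrase control mechanism."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Ga(t): step response of the accent (tone) control mechanism,
    clipped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- base frequency in Hz
    phrases -- list of (T0, Ap): onset time and magnitude of phrase commands
    accents -- list of (T1, T2, Aa): onset, offset and amplitude of
               accent/tone commands
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


# Hypothetical values, chosen only for illustration.
BASE_HZ = 180.0
phrase_cmds = [(0.0, 0.5)]
rising = [(0.3, 0.6, 0.4)]    # positive command raises F0
falling = [(0.3, 0.6, -0.4)]  # negative command (assumed allowed) lowers F0

contour = [fujisaki_f0(0.01 * i, BASE_HZ, phrase_cmds, rising)
           for i in range(100)]
```

Sampling `fujisaki_f0` on a time grid, as in the last lines, gives a contour that could be passed to a synthesizer's pitch control. In a rule-based setup, each tone would map to its own set of command timings and amplitudes.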


Bibliographic reference. Nguyen, Dung Tien / Luong, Mai Chi / Vu, Bang Kim / Mixdorff, Hansjoerg / Ngo, Huy Hoang (2004): "Fujisaki model based F0 contours in Vietnamese TTS", in INTERSPEECH-2004, 1429-1432.