Eighth ISCA Workshop on Speech Synthesis

Barcelona, Catalonia, Spain
August 31-September 2, 2013

Vietnamese HMM-based Speech Synthesis with Prosody Information

Anh-Tuan Dinh (1), Thanh-Son Phan (2), Tat-Thang Vu (1), Chi Mai Luong (1)

(1) Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
(2) Faculty of Information Technology, Le Qui Don Technical University, Hanoi, Vietnam

Generating a natural-sounding synthetic voice is a goal of every text-to-speech system. To meet this goal, many prosody features have been used in the full-context labels of an HMM-based Vietnamese synthesizer. In the prosody specification, part-of-speech (POS) and intonation information are considered less important than positional information. This paper investigates the impact of POS and intonation tagging on the naturalness of the HMM-based voice. It was found that the POS and intonation tags help reconstruct the duration and emotion in the synthesized voice.

Index Terms: Vietnamese speech synthesis, tone characteristics, tonal language, prosody tagging, part-of-speech, hidden Markov models


Bibliographic reference.  Dinh, Anh-Tuan / Phan, Thanh-Son / Vu, Tat-Thang / Luong, Chi Mai (2013): "Vietnamese HMM-based speech synthesis with prosody information", In SSW8, 31-34.