Eighth ISCA Workshop on Speech Synthesis
Barcelona, Catalonia, Spain
Generating a natural-sounding synthetic voice is a goal of every text-to-speech system. To meet this goal, many prosodic features have been used in the full-context labels of an HMM-based Vietnamese synthesizer. In prosody specification, part-of-speech (POS) and intonation information is considered less important than positional information. This paper investigates the impact of POS and intonation tagging on the naturalness of the HMM-based voice. The results show that POS and intonation tags help reconstruct duration and emotion in the synthesized voice.
Index Terms: Vietnamese speech synthesis, tone characteristics, tonal language, prosody tagging, part-of-speech, hidden Markov models
Bibliographic reference. Dinh, Anh-Tuan / Phan, Thanh-Son / Vu, Tat-Thang / Luong, Chi Mai (2013): "Vietnamese HMM-based speech synthesis with prosody information", In SSW8, 31-34.