ISCA Archive SSW 2013

Vietnamese HMM-based speech synthesis with prosody information

Anh-Tuan Dinh, Thanh-Son Phan, Tat-Thang Vu, Chi Mai Luong

Generating natural-sounding synthetic speech is a goal of every text-to-speech system. To meet this goal, many prosodic features have been used in the full-context labels of an HMM-based Vietnamese synthesizer. In the prosody specification, part-of-speech (POS) and intonation information are considered less important than positional information. This paper investigates the impact of POS and intonation tagging on the naturalness of the HMM-based voice. It was found that the POS and intonation tags help reconstruct duration and emotion in the synthesized voice.

Index Terms: Vietnamese speech synthesis, tone characteristics, tonal language, prosody tagging, part-of-speech, hidden Markov models


Cite as: Dinh, A.-T., Phan, T.-S., Vu, T.-T., Luong, C.M. (2013) Vietnamese HMM-based speech synthesis with prosody information. Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8), 31-34

@inproceedings{dinh13_ssw,
  author={Anh-Tuan Dinh and Thanh-Son Phan and Tat-Thang Vu and Chi Mai Luong},
  title={{Vietnamese HMM-based speech synthesis with prosody information}},
  year=2013,
  booktitle={Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8)},
  pages={31--34}
}