ISCA Archive Interspeech 2009

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

Zhiwei Shuang, Shiyin Kang, Qin Shi, Yong Qin, Lianhong Cai

This paper introduces a Syllable HMM based Mandarin TTS system. Each syllable is modeled with a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we exploit the consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision for HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, achieves an overall quality close to that of a concatenative TTS system with a data size of 1 GB.


doi: 10.21437/Interspeech.2009-145

Cite as: Shuang, Z., Kang, S., Shi, Q., Qin, Y., Cai, L. (2009) Syllable HMM based Mandarin TTS and comparison with concatenative TTS. Proc. Interspeech 2009, 1767-1770, doi: 10.21437/Interspeech.2009-145

@inproceedings{shuang09_interspeech,
  author={Zhiwei Shuang and Shiyin Kang and Qin Shi and Yong Qin and Lianhong Cai},
  title={{Syllable HMM based Mandarin TTS and comparison with concatenative TTS}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1767--1770},
  doi={10.21437/Interspeech.2009-145}
}