This paper presents an embedded, concatenative approach to a multilingual text-to-speech system (ECMTTS). Under a uniform architecture, the TTS modules are separated into language-dependent and language-independent ones. A specifically defined super phonetic symbol set enables the use of uniform speech units for concatenation, and a carefully designed indexing and storage scheme reduces the size of the speech inventory. The TTS system employs an improved cost-function-based unit selection strategy, an efficient speech synthesizer, and a refined concatenation approach to balance speech quality against the memory and computation requirements of embedded platforms.
Cite as: Chen, G.-L., Han, K.-S., Yu, Z.-L., Yue, D.-J., Zu, Y.-Q. (2005) An embedded and concatenative approach to TTS of multiple languages. Proc. Interspeech 2005, 2541-2544, doi: 10.21437/Interspeech.2005-790
@inproceedings{chen05g_interspeech,
  author={Gui-Lin Chen and Ke-Song Han and Zhen-Li Yu and Dong-Jian Yue and Yi-Qing Zu},
  title={{An embedded and concatenative approach to TTS of multiple languages}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2541--2544},
  doi={10.21437/Interspeech.2005-790}
}