8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Recent Enhancements in CU VOCAL for Chinese TTS-Enabled Applications

Helen M. Meng, Yuk-Chi Li, Tien-Ying Fung, Man-Cheuk Ho, Chi-Kin Keung, Tin-Hang Lo, Wai-Kit Lo, P.C. Ching

Chinese University of Hong Kong, China

CU VOCAL is a Cantonese text-to-speech (TTS) engine. We use a syllable-based concatenative synthesis approach to generate intelligible and natural synthesized speech [1]. This paper describes several recent enhancements in CU VOCAL. First, we have augmented the syllable unit selection strategy with a positional feature. This feature specifies the relative location of a syllable in a sentence and serves to improve the quality of Cantonese tone realization. Second, we have developed the CU VOCAL SAPI engine, a version of the synthesizer that eases integration with applications that use SAPI (Speech Application Programming Interface). We demonstrate the use of CU VOCAL SAPI in an electronic book (e-book) reader. Third, we have made an initial attempt to use the CU VOCAL SAPI engine in Web content authored with Speech Application Language Tags (SALT). SALT tags can ease the task of invoking the Cantonese TTS service from web pages.
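
As a rough illustration of how a positional feature might enter syllable unit selection, the following is a minimal sketch under stated assumptions and is not taken from the paper. It assumes that each recorded unit is stored with the relative location (0.0 for sentence-initial, 1.0 for sentence-final) it had in its source sentence, and that the selector simply prefers the candidate whose stored location is closest to the target location. The names `Unit`, `relative_position`, and `select_units`, the Jyutping-style syllable labels, and the distance measure are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Unit:
    """A recorded syllable unit (hypothetical structure, for illustration only)."""
    syllable: str    # syllable label including tone, e.g. "nei5"
    position: float  # relative location in the source sentence, 0.0 to 1.0


def relative_position(index: int, length: int) -> float:
    """Map a syllable index to a relative location in [0.0, 1.0]."""
    if length <= 1:
        return 0.0
    return index / (length - 1)


def select_units(syllables: List[str], inventory: List[Unit]) -> List[Unit]:
    """For each target syllable, pick the matching unit whose recorded
    relative position is closest to the target's relative position."""
    chosen = []
    n = len(syllables)
    for i, syl in enumerate(syllables):
        target = relative_position(i, n)
        candidates = [u for u in inventory if u.syllable == syl]
        if not candidates:
            raise KeyError(f"no recorded unit for syllable {syl!r}")
        # Assumed cost: absolute difference between relative positions.
        chosen.append(min(candidates, key=lambda u: abs(u.position - target)))
    return chosen
```

A real selector would presumably combine such a positional term with other costs, for example the smoothness of the joins between adjacent units. The sketch isolates the positional term only.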


Bibliographic reference. Meng, Helen M. / Li, Yuk-Chi / Fung, Tien-Ying / Ho, Man-Cheuk / Keung, Chi-Kin / Lo, Tin-Hang / Lo, Wai-Kit / Ching, P.C. (2003): "Recent enhancements in CU VOCAL for Chinese TTS-enabled applications", in EUROSPEECH-2003, 1253-1256.