International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

The Effect of Tonal Context on Cantonese Concatenative Speech Synthesis

Tien-Ying Fung, Helen Meng

Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong

This paper describes our study of the effect of tonal context on Cantonese concatenative speech synthesis. We previously developed a speech synthesizer, CU VOCAL, that concatenates syllables to generate Cantonese and Mandarin speech [1, 2]. The preliminary version of CU VOCAL captures only the place of articulation as coarticulatory context, using distinctive features in unit selection [3]. However, for some Cantonese syllables in the synthesized speech we noticed discrepancies between the perceived tone and the desired tone, which degraded the perceived quality of the synthesis output. This suggests that our unit selection strategy should be extended to incorporate tonal context as well. To devise such a strategy, we studied the relative importance of the left and right tonal contexts in terms of their influence on the perceived tone of the current syllable. We also defined a scheme for measuring the difference between a desired syllable token and its tonal variant in terms of attributes such as tone shape, tone height and tone trajectory. Hence, if a desired syllable token is unavailable during concatenative synthesis, we can substitute its "closest" tonal variant, as suggested by our unit selection scheme.
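
The idea of substituting the "closest" tonal variant can be illustrated with a minimal sketch. The abstract does not give the actual measure, so everything below is an assumption made for illustration: the names `TONE_CONTOURS`, `tone_distance` and `closest_variant` are hypothetical, the weights are arbitrary, and "trajectory" is approximated here as the pitch change from onset to offset. The contour values are the commonly cited Chao tone-letter descriptions of the six Cantonese lexical tones.

```python
# Illustrative sketch only: the features, weights, and function names are
# assumptions for this example, not the scheme defined in the paper.

# Commonly cited Chao tone-letter values (start level, end level on a 1-5 scale)
# for the six Cantonese lexical tones.
TONE_CONTOURS = {
    1: (5, 5),  # high level
    2: (3, 5),  # high rising
    3: (3, 3),  # mid level
    4: (2, 1),  # low falling
    5: (2, 3),  # low rising
    6: (2, 2),  # low level
}


def shape(tone):
    """Coarse tone shape: 'rising', 'falling', or 'level'."""
    start, end = TONE_CONTOURS[tone]
    if end > start:
        return "rising"
    if end < start:
        return "falling"
    return "level"


def height(tone):
    """Average pitch level of the tone (assumed proxy for tone height)."""
    start, end = TONE_CONTOURS[tone]
    return (start + end) / 2


def tone_distance(desired, candidate, w_shape=2.0, w_height=1.0, w_traj=1.0):
    """Weighted difference between two tones (hypothetical weights)."""
    d_start, d_end = TONE_CONTOURS[desired]
    c_start, c_end = TONE_CONTOURS[candidate]
    shape_cost = 0.0 if shape(desired) == shape(candidate) else 1.0
    height_cost = abs(height(desired) - height(candidate))
    # Trajectory is approximated as the onset-to-offset pitch change.
    traj_cost = abs((d_end - d_start) - (c_end - c_start))
    return w_shape * shape_cost + w_height * height_cost + w_traj * traj_cost


def closest_variant(desired, available):
    """Pick the available tone with the smallest distance to the desired tone.

    Ties are broken by the lower tone number so the result is deterministic.
    """
    return min(available, key=lambda t: (tone_distance(desired, t), t))
```

Under these assumed weights, a missing mid-level token (tone 3) would be replaced by a low-level token (tone 6) rather than a high-level (tone 1) or low-falling (tone 4) one, because tone 6 shares the level shape and lies nearest in height. A real system would presumably tune such costs against perceptual data.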

References

  1. Fung, T. Y. and Meng, H., "Concatenating Syllables for Response Generation in Spoken Language Applications," Proceedings of ICASSP 2000.
  2. Meng, H. et al., "CU VOCAL: Corpus-based Syllable Concatenation for Chinese Speech Synthesis across Domains and Dialects," Proceedings of the International Conference on Spoken Language, 2002.
  3. Rabiner, L. R. and Schafer, R. W., "Digital Processing of Speech Signals," pp. 39-41, Prentice-Hall, 1978.



Bibliographic reference. Fung, Tien-Ying / Meng, Helen (2002): "The effect of tonal context on Cantonese concatenative speech synthesis", In ISCSLP 2002, paper 66.