Sixth International Conference on Spoken Language Processing
October 16-20, 2000
Using Cross-Syllable Units for Cantonese Speech Synthesis
Ka Man Law, Tan Lee
Department of Electronic Engineering,
The Chinese University of Hong Kong
Monosyllables have been widely accepted as the basic units
for concatenative speech synthesis of Chinese dialects.
However, concatenating individual syllables is not adequate
to produce highly natural synthetic speech because of the
improper coupling at syllable boundaries. This paper
describes a preliminary study on using cross-syllable units
for Cantonese speech synthesis. The acoustic inventory
contains 1725 cross-syllable units, which are excised from
properly selected and recorded carrier words. TD-PSOLA is
employed for prosodic modification of synthetic speech.
The results of subjective listening tests reveal that the
proposed use of cross-syllable units has potential in
producing highly natural synthetic speech, although the
currently achieved performance is only fair. Substantial
improvement is anticipated with better smoothing techniques
for waveform concatenation and greater coverage of
context-dependent variation of the acoustic units.
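
As an illustration of what smoothing at a concatenation point can mean, the sketch below joins two waveform units with a linear cross-fade. It is not the method used in the paper: the abstract does not specify the smoothing technique, and the function name `crossfade_concat`, the `overlap` parameter, and the use of plain sample lists are all assumptions made for this example.

```python
def crossfade_concat(left, right, overlap):
    """Join two lists of samples, linearly cross-fading over `overlap` samples.

    Hypothetical helper for illustration only; a real synthesizer would
    more likely align the join pitch-synchronously before blending.
    """
    if overlap < 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both units")
    if overlap == 0:
        # No smoothing: plain butt-joined concatenation.
        return list(left) + list(right)

    head = list(left[:len(left) - overlap])   # untouched part of the left unit
    tail = list(right[overlap:])              # untouched part of the right unit

    blended = []
    for i in range(overlap):
        # Weight of the right unit rises linearly across the overlap region.
        w = (i + 1) / (overlap + 1)
        a = left[len(left) - overlap + i]
        b = right[i]
        blended.append((1.0 - w) * a + w * b)

    return head + blended + tail
```

With `overlap` set to 0 the function reduces to plain concatenation, which is the case where abrupt discontinuities at the unit boundary are most likely to be audible.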
Law, Ka Man / Lee, Tan (2000):
"Using cross-syllable units for Cantonese speech synthesis",
In ICSLP-2000, vol.2, 407-410.