7th International Conference on Spoken Language Processing
September 16-20, 2002
Cantonese is a major Chinese dialect with a complicated tone system. This research focuses on quantitative modeling of Cantonese tones. It uses Stem-ML, a language-independent framework for quantitative intonation modeling and generation. A set of F0 prediction models is built and trained on acoustic data. The prediction error is about 11 Hz, or 1 semitone. The resulting optimal model parameters are analyzed in light of linguistic knowledge. Key observations include: (1) there is no obvious advantage to modeling the entering tones separately, since they can be considered simply truncated versions of the non-entering tones; (2) Cantonese appears to have a declining phrase intonation; (3) tones in the initial positions of a phrase or sentence tend to have greater prosodic strength than those in the final positions; (4) content words are stronger than function words; (5) long words are stronger than short words.
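The equivalence between 11 Hz and 1 semitone quoted for the prediction error holds only near a particular F0, because the semitone scale is logarithmic. The sketch below is plain arithmetic, not the authors' evaluation procedure: it assumes the error is treated as an upward step from a reference F0, and the function names are hypothetical. Under that assumption, the two figures coincide at a reference F0 of roughly 185 Hz.

```python
import math


def semitone_interval(f_low_hz: float, f_high_hz: float) -> float:
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def reference_f0_for(error_hz: float, error_semitones: float) -> float:
    """F0 at which an upward step of error_hz spans error_semitones.

    Assumption for illustration only: the error is modeled as a step
    above the reference F0. The abstract does not state how the two
    figures were related.
    """
    return error_hz / (2.0 ** (error_semitones / 12.0) - 1.0)


# Figures taken from the abstract: about 11 Hz, about 1 semitone.
ref = reference_f0_for(11.0, 1.0)
print(f"reference F0: {ref:.1f} Hz")
print(f"interval check: {semitone_interval(ref, ref + 11.0):.3f} semitones")
```

At a lower reference F0 the same 11 Hz step spans more than one semitone, and at a higher one it spans less, which is why errors in Hz and in semitones are not interchangeable across speakers.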
Bibliographic reference. Lee, Tan / Kochanski, Greg / Shih, Chilin / Li, Yujia (2002): "Modeling tones in continuous Cantonese speech", In ICSLP-2002, 2401-2404.