EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Analysis and Modeling of Syllable Duration for Thai Speech Synthesis

Chatchawarn Hansakunbuntheung (1), Virongrong Tesprasit (1), Rungkarn Siricharoenchai (1), Yoshinori Sagisaka (2)

(1) NECTEC, Thailand
(2) Waseda University, Japan

This paper presents an analysis of the factors that control Thai syllable duration and a statistical control model based on linear regression. The analyses were carried out at both the syllable level and the phrase level. At the syllable level, the effects of the five Thai tones and of syllable structure were investigated. To analyze syllable-structure effects statistically, we applied quantification theory with two linguistic factors: (1) the phone categories themselves, and (2) categories grouped by articulatory similarity. At the phrase level, the effects of position in the phrase and of the number of syllables in the phrase were analyzed. The experimental results showed that tone, syllable structure, and position in the phrase play significant roles in syllable duration control, whereas the syllable count of the phrase affects syllable duration only slightly. These results were integrated into a statistical control model. The duration assignment precision of the proposed model was evaluated using 2480-word speech data. A total correlation of 0.73 between predicted and observed values on the test-set samples indicates fair precision for the proposed control model.
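
To make the idea of a linear-regression duration model over categorical factors more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each categorical factor is dummy-coded (one 0/1 column per non-reference category) and fitted by ordinary least squares, which is how regression on categorical factors is commonly set up. The factor levels, the millisecond values, and all function names are hypothetical and chosen only for illustration; the paper's real factors also include syllable structure and syllable count, which are omitted here.

```python
import math
from itertools import product

# Hypothetical factor levels, for illustration only (not the paper's categories).
FACTORS = {
    "tone": ["mid", "low", "falling", "high", "rising"],
    "position": ["initial", "medial", "final"],
}

def encode(sample):
    """Dummy-code a sample: intercept plus one 0/1 column per non-reference level."""
    row = [1.0]
    for name, levels in FACTORS.items():
        row += [1.0 if sample[name] == level else 0.0 for level in levels[1:]]
    return row

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit(samples, durations):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    X = [encode(s) for s in samples]
    n = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    xty = [sum(row[i] * y for row, y in zip(X, durations)) for i in range(n)]
    return solve(xtx, xty)

def predict(weights, sample):
    """Predicted duration: sum of the weights of the active categories."""
    return sum(w * x for w, x in zip(weights, encode(sample)))

def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Synthetic, exactly additive toy data in milliseconds (made-up numbers).
BASE_MS = 250.0
TONE_MS = {"mid": 0.0, "low": 10.0, "falling": 25.0, "high": -5.0, "rising": 30.0}
POS_MS = {"initial": 0.0, "medial": -20.0, "final": 60.0}

samples = [{"tone": t, "position": p}
           for t, p in product(FACTORS["tone"], FACTORS["position"])]
durations = [BASE_MS + TONE_MS[s["tone"]] + POS_MS[s["position"]] for s in samples]

weights = fit(samples, durations)
predicted = [predict(weights, s) for s in samples]
r = correlation(predicted, durations)
```

Because the toy data are exactly additive and noise-free, the fit reproduces them and `r` is essentially 1. Real speech data contain variation that such a model does not capture, which is why a correlation on held-out test samples, as reported above, is the meaningful measure.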

Bibliographic reference.  Hansakunbuntheung, Chatchawarn / Tesprasit, Virongrong / Siricharoenchai, Rungkarn / Sagisaka, Yoshinori (2003): "Analysis and modeling of syllable duration for Thai speech synthesis", In EUROSPEECH-2003, 93-96.