ISCA Archive SSW 2007

Syllable-based Thai duration model using multi-level linear regression and syllable accommodation

Chatchawarn Hansakunbuntheung, Hiroaki Kato, Yoshinori Sagisaka

This paper proposes a syllable-based Thai duration model using multi-level linear regression and syllable accommodation. To build a timing model that directly reflects control characteristics, we introduce two analysis results on hierarchical control characteristics. The first analysis showed that the syllable is highly correlated with timing controls above the phone level, while phone differences by themselves do not affect higher-level control and contribute only to local timing control. The second, on syllable accommodation, showed that phone duration depends strongly on local phone factors. These results support the syllable-based hierarchical model proposed in this paper. Duration prediction experiments with 5-fold cross-validation yielded RMS errors of 46.73 and 32.37 ms and correlations between measured and predicted duration of 0.905 and 0.811 at the syllable and phone levels, respectively. A comparison of prediction precision showed that the proposed syllable-based multi-level duration model performed better than a conventional single-level phone duration model.
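
The two evaluation measures reported above, RMS error and correlation between measured and predicted duration, can be sketched as follows. This is only a minimal illustration, not the authors' evaluation code: the function names `rms_error` and `pearson_correlation` are hypothetical, the correlation is assumed to be the Pearson coefficient, and the toy durations are invented for the example rather than taken from the paper.

```python
import math


def rms_error(measured, predicted):
    """Root-mean-square error between two equal-length lists of durations (ms)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_correlation(measured, predicted):
    """Pearson correlation coefficient between measured and predicted durations."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_p)


# Toy durations in ms, invented for illustration only (not data from the paper).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]

print(rms_error(measured, predicted))
print(pearson_correlation(measured, predicted))
```

In a 5-fold cross-validation setting, one plausible use would be to compute both measures on the held-out fold of each split, separately at the syllable and phone levels, and then aggregate across folds; the abstract does not state how the reported figures were aggregated.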

Cite as: Hansakunbuntheung, C., Kato, H., Sagisaka, Y. (2007) Syllable-based Thai duration model using multi-level linear regression and syllable accommodation. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 356-361

@inproceedings{hansakunbuntheung07_ssw,
  author={Chatchawarn Hansakunbuntheung and Hiroaki Kato and Yoshinori Sagisaka},
  title={{Syllable-based Thai duration model using multi-level linear regression and syllable accommodation}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={356--361}
}