Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Multi-Strategy Data Mining on Mandarin Prosodic Patterns

Yiqiang Chen (1), Wen Gao (1), Tingshao Zhu (1), Jiyong Ma (2)

(1) Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
(2) Dept. of Computing Science, University of Alberta, Edmonton, Canada

Mandarin prosodic models, which mainly describe the variation of pitch, are very important in speech research and synthesis. The models now used in most Chinese Text-To-Speech systems are constructed by experts, qualitatively and with low precision. In this paper, we propose a multi-strategy data mining framework that extracts prosodic patterns from a large database of actual Mandarin speech in order to improve the naturalness and intelligibility of synthesized speech. In data preprocessing, typical prosody models are found by cluster analysis, and rough set theory is employed for feature selection. An artificial neural network (ANN) and a decision tree are then trained separately, and their predictions are integrated to generate fundamental frequency and energy contours. The experimental results show that the synthesized prosodic features closely resemble their original counterparts for most syllables.
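As a rough illustration of two of the steps named in the abstract, the sketch below clusters fixed-length pitch contours to find typical patterns and then blends two predicted contours. It is not the authors' implementation: the abstract does not specify the clustering algorithm or the integration rule, so plain k-means and a weighted average are assumptions made here, and the names kmeans, integrate, and weight are hypothetical.

```python
import random


def kmeans(contours, k, iters=20, seed=0):
    """Cluster equal-length pitch contours with plain k-means.

    Assumption: the abstract only says "clustering analysis"; k-means with
    squared Euclidean distance is used here purely for illustration.
    """
    rng = random.Random(seed)
    # Start from k distinct contours chosen at random.
    centers = [list(c) for c in rng.sample(contours, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for contour in contours:
            # Assign each contour to its nearest center.
            nearest = min(
                range(k),
                key=lambda j: sum(
                    (a - b) ** 2 for a, b in zip(contour, centers[j])
                ),
            )
            groups[nearest].append(contour)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def integrate(ann_pred, tree_pred, weight=0.5):
    """Blend two predicted contours point by point.

    Assumption: a weighted average stands in for the unspecified
    integration of the ANN and decision tree predictions.
    """
    return [weight * a + (1 - weight) * t for a, t in zip(ann_pred, tree_pred)]


if __name__ == "__main__":
    # Toy F0 contours in Hz: two rising and two falling patterns.
    f0_contours = [
        [100.0, 120.0, 140.0],
        [102.0, 122.0, 142.0],
        [140.0, 120.0, 100.0],
        [138.0, 118.0, 98.0],
    ]
    print(sorted(kmeans(f0_contours, k=2)))
    print(integrate([100.0, 110.0], [120.0, 130.0]))
```

In a real system the cluster centers would serve as candidate prosody templates, and the blending weight would be tuned on held-out speech rather than fixed at 0.5.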



Bibliographic reference. Chen, Yiqiang / Gao, Wen / Zhu, Tingshao / Ma, Jiyong (2000): "Multi-strategy data mining on Mandarin prosodic patterns", in ICSLP-2000, vol. 2, 59-62.