International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

Modeling Duration and Intonation in Mandarin Chinese Synthesis with a Neural Network

Hongwei Ding, Oliver Jokisch, Hans Kruschke

Dresden University of Technology, Germany

Prosody control plays an important role in the naturalness of synthesized speech. Previous work has devoted great effort to rule-based and parameter-based prosodic models. More recently, neural networks have been employed to capture the complex interaction of the relevant prosodic factors. This paper presents a new method of learning and modeling duration and intonation in Mandarin Chinese synthesis with a neural network, which proved to be an appropriate approach in our Mandarin synthesis system. The material for the study of prosodic components was extracted from a phonetically and prosodically labeled sentence database uttered by the same speaker who recorded the synthesis inventory. The paper reports the study of duration and intonation, the analysis of the database, the concept of the neural network model, and the evaluation of the training results.


Bibliographic reference. Ding, Hongwei / Jokisch, Oliver / Kruschke, Hans (2002): "Modeling duration and intonation in Mandarin Chinese synthesis with a neural network", in ISCSLP 2002, paper 24.