Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Learning the Parameters of Quantitative Prosody Models

Oliver Jokisch, Hansjörg Mixdorff, Hans Kruschke, Ulrich Kordon

Laboratory of Acoustics and Speech Communication, Dresden University of Technology, Germany

The article introduces a novel hybrid data-driven and rule-based approach to prosody control in a TTS system, which combines the advantages of well-balanced quantitative models with the flexible training of derived model parameters. Taking the training of Fujisaki intonation parameters for German (MFGI) as an example, the article describes the hybrid data-driven and rule-based architecture HYDRA, the speech database, the extraction of the model parameters, and the neural network (NN) training of these parameters. Preliminary results using the hybrid intonation model are presented. A hybrid neural-network and rule-based quantitative model can be easily parameterized and adapted, e.g. for multilingual applications, but it has a higher complexity and requires the automatic extraction of the model parameters from a speech database.
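For readers unfamiliar with the kind of parameters being learned, the following is a minimal sketch of the standard Fujisaki intonation model, in which log F0 is a base frequency plus superimposed phrase and accent components. It is not the authors' MFGI or HYDRA implementation. The function names, the example command values, and the constants alpha = 2.0, beta = 20.0 and gamma = 0.9 are assumptions chosen for illustration and do not come from the abstract.

```python
import math

def phrase_response(t, alpha=2.0):
    """Phrase control impulse response Gp(t); zero before the command onset."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control step response Ga(t), limited by the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- base frequency Fb in Hz
    phrases -- list of (T0, Ap): onset time and magnitude of phrase commands
    accents -- list of (T1, T2, Aa): onset, offset and amplitude of accent commands
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)

# Hypothetical example: one phrase command and one accent command, 10 ms frames.
contour = [fujisaki_f0(i * 0.01, 80.0, [(0.0, 0.5)], [(0.3, 0.6, 0.4)])
           for i in range(150)]
```

In a setup like the one the abstract describes, quantities such as Ap, Aa and the command timings would be the targets extracted from the speech database and predicted by the trained network, while a function like `fujisaki_f0` would turn the predictions back into a contour.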

Bibliographic reference. Jokisch, Oliver / Mixdorff, Hansjörg / Kruschke, Hans / Kordon, Ulrich (2000): "Learning the parameters of quantitative prosody models", in ICSLP-2000, vol. 1, 645-648.