Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Natural F0 Contours with a New Neural-Network-Hybrid Approach

Caglayan Erdem (1,2), Martin Holzapfel (2), Rüdiger Hoffmann (1)

(1) Dresden University of Technology, Dresden, Germany
(2) Siemens AG Corporate Technology, Munich, Germany

Text-to-Speech (TTS) systems still suffer from unnatural prosody generation. To increase customer acceptance, more sophisticated prosody modelling is required. This paper presents a new hybrid approach that combines the advantages of two existing state-of-the-art modelling strategies.

After presenting two state-of-the-art approaches with their advantages and shortcomings in section 1, we discuss the new architecture of the hybrid approach in section 2, outlining the data-driven interconnection of the two base approaches. Finally, we present a search performed on the database, which uses a fuzzy-motivated nonlinear parametric cost and suitability function to obtain the desired F0 control parameters. The hybrid approach improved the F0 generation module of our TTS system PAPAGENO.

Bibliographic reference. Erdem, Caglayan / Holzapfel, Martin / Hoffmann, Rüdiger (2000): "Natural F0 contours with a new neural-network-hybrid approach", in ICSLP-2000, vol. 3, 227-230.