Sixth International Conference on Spoken Language Processing
Text-to-Speech (TTS) systems still suffer from unnatural prosody generation. To increase customer acceptance, more sophisticated prosody modelling is required. This paper presents a new hybrid approach that combines the advantages of two existing state-of-the-art modelling strategies.
After presenting two state-of-the-art approaches with their advantages and shortcomings in section 1, we discuss the new architecture of the hybrid approach in section 2, outlining the data-driven interconnection of the two base approaches. Finally, we present a search performed on the database, which uses a fuzzy-motivated nonlinear parametric cost and suitability function to obtain the desired F0 control parameters. The hybrid approach improved the F0 generation module of our TTS system PAPAGENO.
Bibliographic reference. Erdem, Caglayan / Holzapfel, Martin / Hoffmann, Rüdiger (2000): "Natural F0 contours with a new neural-network-hybrid approach", In ICSLP-2000, vol.3, 227-230.