ISCA Archive SSW 1998

Automatic prosody generation using suprasegmental unit selection

F. Malfrère, Thierry Dutoit, Piet Mertens

Text-to-Prosody systems based on prosodic databases extracted from natural speech are expected to be key to the further development of Text-to-Speech systems.

This paper describes a system that uses such speech databases to generate the rhythm and the intonation of texts written in French. The system is based on a very crude chinks ’n chunks prosodic phrasing algorithm and on an automatic prosodic analysis of a natural speech database. The rhythm of the synthetic speech is generated with a CART (classification and regression tree) trained on a large single-speaker speech corpus. The acoustic aspect of the intonation is built from a set of prosodic patterns automatically extracted from the same speech corpus. At synthesis time, patterns are chosen on the fly from the database so as to minimize a total selection cost composed of pattern target costs and pattern concatenation costs.
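
The selection step described above, minimizing a sum of target costs and concatenation costs over a sequence of patterns, can be illustrated with a small dynamic-programming search. This is only a minimal sketch under stated assumptions: the abstract does not say which search procedure the authors use, and the function names (`select_patterns`, `toy_target_cost`, `toy_concat_cost`), the dictionary keys (`accent`, `f0_start`, `f0_end`), and the cost definitions are all hypothetical, chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Pattern = dict  # hypothetical representation of one prosodic pattern


def select_patterns(
    targets: Sequence[dict],
    candidates: Sequence[Sequence[Pattern]],
    target_cost: Callable[[dict, Pattern], float],
    concat_cost: Callable[[Pattern, Pattern], float],
) -> Tuple[List[Pattern], float]:
    """Pick one candidate per target so that the summed target and
    concatenation costs are minimal (Viterbi-style dynamic programming)."""
    if not targets or len(targets) != len(candidates):
        raise ValueError("need one non-empty candidate list per target")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every target needs at least one candidate")

    # best[i][j] = (cost of the cheapest path ending in candidates[i][j],
    #               index of the predecessor on that path)
    best = [[(target_cost(targets[0], c), -1) for c in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for cand in candidates[i]:
            prev_cost, back = min(
                (best[i - 1][k][0] + concat_cost(prev, cand), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(targets[i], cand), back))
        best.append(row)

    # Backtrace from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path: List[Pattern] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


def toy_target_cost(target: dict, cand: Pattern) -> float:
    # Assumption: a mismatch in a symbolic accent label costs 1.
    return 0.0 if target["accent"] == cand["accent"] else 1.0


def toy_concat_cost(prev: Pattern, cand: Pattern) -> float:
    # Assumption: penalize the F0 jump (in Hz) at the pattern boundary.
    return abs(prev["f0_end"] - cand["f0_start"]) / 50.0


targets = [{"accent": "rise"}, {"accent": "fall"}]
candidates = [
    [
        {"accent": "rise", "f0_start": 100, "f0_end": 150},
        {"accent": "rise", "f0_start": 100, "f0_end": 200},
    ],
    [
        {"accent": "fall", "f0_start": 200, "f0_end": 90},
        {"accent": "rise", "f0_start": 150, "f0_end": 180},
    ],
]
path, total = select_patterns(targets, candidates, toy_target_cost, toy_concat_cost)
```

In this toy setup the search prefers the rising pattern whose final F0 matches the start of the falling pattern, because that pairing has both matching labels and no F0 discontinuity at the join. A real system would derive both costs from the features of its own prosodic database.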


Cite as: Malfrère, F., Dutoit, T., Mertens, P. (1998) Automatic prosody generation using suprasegmental unit selection. Proc. 3rd ESCA/COCOSDA Workshop on Speech Synthesis (SSW 3), 323-328

@inproceedings{malfrere98_ssw,
  author={F. Malfrère and Thierry Dutoit and Piet Mertens},
  title={{Automatic prosody generation using suprasegmental unit selection}},
  year=1998,
  booktitle={Proc. 3rd ESCA/COCOSDA Workshop on Speech Synthesis (SSW 3)},
  pages={323--328}
}