Third ESCA/COCOSDA Workshop on Speech Synthesis

November 26-29, 1998
Jenolan Caves House, Blue Mountains, NSW, Australia

Automatic Prosody Generation Using Suprasegmental Unit Selection

F. Malfrère (1), Thierry Dutoit (1), Piet Mertens (2)

(1) Faculté Polytechnique de Mons, Belgium
(2) K. U. Leuven - Département de Linguistique, Belgium

Text-to-prosody systems that draw on prosodic databases extracted from natural speech are expected to play a key role in the further development of text-to-speech systems.

This paper describes a system that uses such speech databases to generate the rhythm and intonation of French texts. The system is based on a very crude chinks ’n chunks prosodic phrasing algorithm and on an automatic prosodic analysis of a natural speech database. The rhythm of the synthetic speech is generated with a CART (classification and regression tree) trained on a large single-speaker speech corpus. The acoustic aspect of the intonation is built from a set of prosodic patterns automatically extracted from the same speech corpus. At synthesis time, patterns are chosen on the fly from the database so as to minimize a total selection cost composed of pattern target costs and pattern concatenation costs.
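The selection step described above can be illustrated with a short dynamic-programming sketch. This is not the authors' implementation: the function name `select_patterns`, the way candidates are grouped per target position, and the two cost callbacks are all assumptions made for illustration. The abstract only states that the total cost combines pattern target costs and pattern concatenation costs; the Viterbi-style search used here is one common way to minimize such a sum.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")  # hypothetical target description for one prosodic group
P = TypeVar("P")  # hypothetical candidate pattern taken from the database


def select_patterns(
    targets: Sequence[T],
    candidates: Sequence[Sequence[P]],
    target_cost: Callable[[T, P], float],
    concat_cost: Callable[[P, P], float],
) -> Tuple[List[P], float]:
    """Pick one candidate per target so that the sum of target costs and
    concatenation costs between neighbouring picks is minimal."""
    if len(targets) != len(candidates):
        raise ValueError("one candidate list is required per target")
    if not targets:
        return [], 0.0
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every target needs at least one candidate")

    # best[i][j]: minimal cost of any path that ends in candidates[i][j]
    best = [[target_cost(targets[0], c) for c in candidates[0]]]
    # back[i][j]: index of the predecessor chosen at position i - 1
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        prev = candidates[i - 1]
        row_cost, row_back = [], []
        for cand in candidates[i]:
            # Cheapest way to arrive at this candidate from the previous row
            k_best = min(
                range(len(prev)),
                key=lambda k: best[i - 1][k] + concat_cost(prev[k], cand),
            )
            arrive = best[i - 1][k_best] + concat_cost(prev[k_best], cand)
            row_cost.append(arrive + target_cost(targets[i], cand))
            row_back.append(k_best)
        best.append(row_cost)
        back.append(row_back)

    # Trace the cheapest path back from the last position
    j = min(range(len(best[-1])), key=lambda k: best[-1][k])
    total = best[-1][j]
    path: List[P] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path, total
```

In this sketch the cost functions are left to the caller, since the abstract does not say how the target and concatenation costs are defined. The search visits each pair of neighbouring candidates once, so its running time grows linearly with the number of prosodic groups rather than exponentially with the number of possible pattern sequences.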

Full Paper (with 2 sound examples linked from within the paper)

Bibliographic reference. Malfrère, F. / Dutoit, Thierry / Mertens, Piet (1998): "Automatic Prosody Generation Using Suprasegmental Unit Selection", in SSW3-1998, 323-328.