Text-to-prosody systems that rely on prosodic databases extracted from natural speech will be a key component in the further development of text-to-speech systems. This paper describes a system that uses such databases to generate the rhythm and intonation of written French text. The system is based on a very crude chinks 'n chunks prosodic phrasing algorithm and on a prosodic analysis of a natural speech database. The rhythm of the synthetic speech is generated with a classification and regression tree (CART) trained on a large single-speaker speech corpus. The acoustic realization of the intonation is built from a set of prosodic patterns automatically extracted from the same corpus. The system has been tested on single sentences and news paragraphs. Informal listening tests have shown that the resulting prosody is convincing most of the time.
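The abstract does not spell out the phrasing step, so the following is only a minimal sketch of the chinks 'n chunks idea as it is commonly described: function words ("chinks") open a prosodic group, content words ("chunks") close it, and a boundary is placed wherever a chink follows a chunk. The word list `CHINKS` and the function names are hypothetical illustrations, not the authors' lexicon or implementation.

```python
# Hypothetical, deliberately tiny list of French function words ("chinks").
# A real system would use a full lexicon or a part-of-speech tagger.
CHINKS = {
    "le", "la", "les", "un", "une", "des", "de", "du",
    "à", "au", "aux", "et", "ou", "dans", "sur", "pour",
    "par", "avec", "il", "elle", "qui", "que", "se", "ne", "en",
}


def is_chink(word):
    """Return True if the word is treated as a function word."""
    return word.lower() in CHINKS


def chinks_n_chunks(words):
    """Split a word list into groups of the form chink* chunk+.

    A new group starts whenever a chink is met after at least one
    chunk has already been added to the current group.
    """
    groups = []
    current = []
    seen_chunk = False
    for word in words:
        if is_chink(word) and seen_chunk:
            groups.append(current)
            current = []
            seen_chunk = False
        current.append(word)
        if not is_chink(word):
            seen_chunk = True
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    sentence = "le petit chat dort sur le canapé rouge".split()
    for group in chinks_n_chunks(sentence):
        print(" ".join(group))
```

Under these assumptions the example sentence is split into two groups, "le petit chat dort" and "sur le canapé rouge". Such groups would then serve as the units on which rhythm and intonation patterns are predicted.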
Cite as: Malfrère, F., Dutoit, T., Mertens, P. (1998) Fully automatic prosody generator for text-to-speech. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0355, doi: 10.21437/ICSLP.1998-156
@inproceedings{malfrere98_icslp,
  author={Fabrice Malfrère and Thierry Dutoit and Piet Mertens},
  title={{Fully automatic prosody generator for text-to-speech}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0355},
  doi={10.21437/ICSLP.1998-156}
}