5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Identification and Automatic Generation of Prosodic Contours for a Text-to-Speech Synthesis System in French

Stéphanie de Tournemire

France Telecom, CNET (Centre National d'Etudes des Telecommunications), Technopole Anticipa, Lannion, France

This paper presents the realisation of an automatically trainable computational prosodic model for French Text-to-Speech Synthesis. The model is built in two steps. The first step predicts fundamental frequency contours and syllable durations from prosodic markers using neural networks [17,12]. In this step, the prosodic markers are automatically extracted from the signal by analysing prosodic realisations [2] and identifying a prosodic alphabet and a set of labelling rules. The second step integrates the model into the CNET Text-to-Speech Synthesis system [7] by using its linguistic levels and predicting prosodic markers from text and linguistic labels. The system is evaluated by naive listeners and compared with the current CNET Text-to-Speech Synthesis system.
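As a rough illustration of the first step only, the sketch below maps one prosodic marker per syllable to a small output vector (three fundamental frequency targets and one duration value) with a one-hidden-layer network. Everything in it is an assumption made for illustration: the marker names in MARKERS, the network size, the four-value output layout, and the random untrained weights are hypothetical and are not taken from the paper, which does not specify them in this abstract.

```python
import math
import random

# Hypothetical prosodic alphabet; the paper's actual marker set is not given here.
MARKERS = ["none", "minor_rise", "major_rise", "final_fall"]


def one_hot(marker):
    """Encode a prosodic marker as a one-hot vector over MARKERS."""
    return [1.0 if m == marker else 0.0 for m in MARKERS]


class TinyMLP:
    """One-hidden-layer network: marker features -> (F0 start, F0 mid, F0 end, duration).

    Weights are random and untrained, so the outputs are placeholders only.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Hidden layer with tanh activation, then a linear output layer.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


def predict_syllables(model, markers):
    """Return one 4-value output vector per syllable marker."""
    return [model.forward(one_hot(m)) for m in markers]


model = TinyMLP(n_in=len(MARKERS), n_hidden=6, n_out=4)
outputs = predict_syllables(model, ["none", "minor_rise", "none", "final_fall"])
```

In a trained system the weights would be fitted to measured contours and durations, and the input would likely carry more context than a single marker; this sketch only shows the shape of the marker-to-prosody mapping.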

Bibliographic reference. Tournemire, Stéphanie de (1997): "Identification and automatic generation of prosodic contours for a text-to-speech synthesis system in French", in EUROSPEECH-1997, 191-194.