INTERSPEECH 2014
15th Annual Conference of the International Speech Communication Association

Singapore
September 14-18, 2014

Adapting Prosodic Chunking Algorithm and Synthesis System to Specific Style: The Case of Dictation

Elisabeth Delais-Roussarie (1), Damien Lolive (2), Hiyon Yoo (1), Nelly Barbot (2), Olivier Rosec (3)

(1) LLF (UMR 7110), France
(2) IRISA, France
(3) Voxygen, France

In this paper, we present an approach that allows a TTS system to dictate texts to primary school pupils while conforming to the prosodic features of this speaking style. The approach relies on a prosodic preprocessing module, which avoids developing a dedicated system for such a limited task. The proposal is based on two distinct elements: (i) the results of a preliminary evaluation that gathered feedback from potential users; (ii) a corpus study of 10 dictations annotated or uttered by 13 teachers or speech therapists (10 and 3 respectively).
   The preliminary evaluation focused on three points: the accuracy of the segmentation procedure, the size of the automatically calculated chunks, and the intelligibility of the synthesized voice. It showed that the chunks were judged too long and the speaking rate too fast. We therefore decided to address these two issues by analyzing the collected data and comparing the observed realizations with the output of the speech synthesis system and the chunking algorithm. The results of the analysis led us to propose a module that, for this speaking style, provides an enriched text that the synthesizer can process to constrain unit selection and prosodic realization.

Bibliographic reference. Delais-Roussarie, Elisabeth / Lolive, Damien / Yoo, Hiyon / Barbot, Nelly / Rosec, Olivier (2014): "Adapting prosodic chunking algorithm and synthesis system to specific style: the case of dictation", in INTERSPEECH-2014, 1673-1677.