A specific test was set up to evaluate how eliding or pronouncing the mute "e" inside polysyllabic words influences the correct identification of common words and proper names in French synthetic speech. The main result is that, to obtain identification scores identical to those obtained with natural speech (where the mute "e" is usually elided in this position), the current quality of the CNET TTS system requires the systematic pronunciation of this mute "e". However, the decrease in identification scores observed when the mute "e" was elided in synthetic speech was notably reduced by "doubling" the surrounding consonants. Some consequences are drawn for the design of a new set of speech units for concatenation-based French synthesis.
Cite as: Larreur, D., Sorin, C. (1990) Quality evaluation of French text-to-speech synthesis within a task the importance of the mute "e". Proc. First ESCA Workshop on Speech Synthesis (SSW 1), 91-96
@inproceedings{larreur90_ssw,
  author={Danièle Larreur and Christel Sorin},
  title={{Quality evaluation of French text-to-speech synthesis within a task the importance of the mute "e"}},
  year=1990,
  booktitle={Proc. First ESCA Workshop on Speech Synthesis (SSW 1)},
  pages={91--96}
}