15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Speech Synthesis in Various Communicative Situations: Impact of Pronunciation Variations

Sandrine Brognaux (1), Benjamin Picart (2), Thomas Drugman (2)

(1) Université catholique de Louvain, Belgium
(2) Université de Mons, Belgium

While current research in speech synthesis focuses on generating various speaking styles or emotions, very few studies have addressed the possibility of including phonetic variations that depend on the communicative situation of the target speech (sports commentaries, TV news, etc.). Yet significant phonetic variations have been observed as a function of several communicative factors (e.g. spontaneous vs. read speech, media broadcast vs. non-broadcast). This study analyzes whether these alternative pronunciations contribute to the plausibility of the message and should therefore be considered in synthesis. To this end, subjective tests are performed on synthesized French sports commentaries. They compare HMM-based speech synthesis using the genuine pronunciation with synthesis using a neutral, NLP-produced phonetization. Results show that integrating the phonetic variations significantly improves the perceived naturalness of the generated speech. They also highlight the relative importance of the various types of variation and show that schwa elisions, in particular, play a crucial role in this respect.

Bibliographic reference. Brognaux, Sandrine / Picart, Benjamin / Drugman, Thomas (2014): "Speech synthesis in various communicative situations: impact of pronunciation variations", in INTERSPEECH-2014, 1524-1528.