Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS

Éva Székely, Gustav Eje Henter, Jonas Beskow, Joakim Gustafson


Extemporaneous speech is a delivery type in public speaking that follows a structured outline but is otherwise delivered conversationally, off the cuff. This demo uses a natural-sounding spontaneous conversational speech synthesiser to simulate this delivery style. We resynthesised the beginnings of two Interspeech keynote speeches with a TTS system that produces multiple versions of each utterance, varying in fluency and filled-pause placement. The platform allows the user to mark the samples according to any perceptual aspect of interest, such as certainty, authenticity, or confidence. During speech delivery, the user can decide on the fly which realisation to play, addressing the audience in a connected, conversational fashion. Our aim is to use this platform to explore speech synthesis evaluation options from a production perspective and in situational contexts.


Cite as: Székely, É., Henter, G.E., Beskow, J., Gustafson, J. (2019) Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS. Proc. Interspeech 2019, 3687-3688.


@inproceedings{Székely2019,
  author={Éva Székely and Gustav Eje Henter and Jonas Beskow and Joakim Gustafson},
  title={{Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3687--3688}
}