Voice Puppetry: Exploring Dramatic Performance to Develop Speech Synthesis

Matthew Aylett, David Braude, Christopher Pidcock, Blaise Potard

Technology and innovation are often inspired by nature. However, when technology enters the social domain, such as creating human-like voices or having human-like conversations, mimicry can become an objective rather than an inspiration. In this paper we argue that performance and acting can offer a radically different design agenda to the mimicry objective. We compare a human mimic’s vocal performance (Alec Baldwin) of a target voice (Donald Trump) with the synthesis and copy resynthesis of a cloned synthetic voice. We show that the conversational speaking style of natural performance is still a challenge to recreate with modern synthesis methods, and that resynthesis is hampered by current limitations in speech alignment approaches. We conclude by discussing how voice puppetry, where a human voice is used to drive a synthesis engine, could be used to advance the state of the art, and the challenges involved in developing a voice puppetry system.

DOI: 10.21437/SSW.2019-21

Cite as: Aylett, M., Braude, D., Pidcock, C., Potard, B. (2019) Voice Puppetry: Exploring Dramatic Performance to Develop Speech Synthesis. Proc. 10th ISCA Speech Synthesis Workshop, 117-120, DOI: 10.21437/SSW.2019-21.

  author={Matthew Aylett and David Braude and Christopher Pidcock and Blaise Potard},
  title={{Voice Puppetry: Exploring Dramatic Performance to Develop Speech Synthesis}},
  booktitle={Proc. 10th ISCA Speech Synthesis Workshop},