Technology and innovation are often inspired by nature. However, when technology enters the social domain, for example by creating human-like voices or holding human-like conversations, mimicry can become an objective rather than an inspiration. In this paper we argue that performance and acting can offer a radically different design agenda to the mimicry objective. We compare a human mimic’s vocal performance (Alec Baldwin) of a target voice (Donald Trump) with the synthesis and copy resynthesis of a cloned synthetic voice. We show that the conversational speaking style of natural performance is still a challenge to recreate with modern synthesis methods, and that resynthesis is hampered by current limitations in speech alignment approaches. We conclude by discussing how voice puppetry, where a human voice is used to drive a synthesis engine, could be used to advance the state of the art, and the challenges involved in developing a voice puppetry system.
Cite as: Aylett, M., Braude, D., Pidcock, C., Potard, B. (2019) Voice Puppetry: Exploring Dramatic Performance to Develop Speech Synthesis. Proc. 10th ISCA Workshop on Speech Synthesis (SSW 10), 117-120, doi: 10.21437/SSW.2019-21
@inproceedings{aylett19_ssw,
  author={Matthew Aylett and David Braude and Christopher Pidcock and Blaise Potard},
  title={{Voice Puppetry: Exploring Dramatic Performance to Develop Speech Synthesis}},
  year=2019,
  booktitle={Proc. 10th ISCA Workshop on Speech Synthesis (SSW 10)},
  pages={117--120},
  doi={10.21437/SSW.2019-21}
}