SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms

Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li


Singing synthesis is a rising musical art form gaining popularity amongst composers and listeners alike. To date, it has been largely confined to the offline setting of the music studio, whereas a large part of music is about live performance. This calls for a real-time synthesis system readily deployable for on-stage applications.

SERAPHIM is a lightweight wavetable synthesis system deployable on mobile platforms. Beyond conventional offline studio applications, SERAPHIM also supports real-time synthesis, enabling live control inputs for on-stage performances. It also provides easy lip-animation control. SERAPHIM will be made available as a toolbox for Unity 3D for easy adoption in game development across multiple platforms. A precompiled version will also be deployed as a VST studio plugin, directly addressing end users. It currently supports Japanese (singing only) and Mandarin (speech and singing). This paper describes our work on SERAPHIM and discusses its capabilities and applications.
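The paper does not detail SERAPHIM's internals here, but for readers unfamiliar with the underlying technique, a minimal wavetable oscillator can be sketched as follows. This is a generic illustration of wavetable synthesis (a stored single-cycle waveform read back at a pitch-dependent phase increment with linear interpolation), not SERAPHIM's actual implementation; the function names and parameters are hypothetical.

```python
import math

def make_wavetable(size=2048):
    """Single cycle of a sine wave; a real voice synthesizer would store
    sampled or modelled waveform frames instead."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]

def render(table, freq, sample_rate=44100, duration=0.01):
    """Read the wavetable at a phase increment determined by the target
    pitch, with linear interpolation between adjacent table samples."""
    out = []
    phase = 0.0
    step = freq * len(table) / sample_rate  # table samples per output sample
    for _ in range(int(sample_rate * duration)):
        i = int(phase)
        frac = phase - i
        a = table[i]
        b = table[(i + 1) % len(table)]  # wrap around the single cycle
        out.append(a + frac * (b - a))
        phase = (phase + step) % len(table)
    return out

samples = render(make_wavetable(), freq=440.0)  # 10 ms of A4
```

Because rendering reduces to table lookups and interpolation rather than full waveform computation, this class of synthesizer is cheap enough for real-time use on mobile hardware, which is the property SERAPHIM exploits.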


DOI: 10.21437/Interspeech.2016-484

Cite as

Chan, P.Y., Dong, M., Ho, G.X.H., Li, H. (2016) SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms. Proc. Interspeech 2016, 1225-1229.

Bibtex
@inproceedings{Chan+2016,
  author={Paul Yaozhu Chan and Minghui Dong and Grace Xue Hui Ho and Haizhou Li},
  title={SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms},
  year={2016},
  booktitle={Interspeech 2016},
  doi={10.21437/Interspeech.2016-484},
  url={http://dx.doi.org/10.21437/Interspeech.2016-484},
  pages={1225--1229}
}