Auditory-Visual Speech Processing 2005

British Columbia, Canada
July 24-27, 2005

ArtiSynth: An Extensible, Cross-Platform 3D Articulatory Speech Synthesizer

Sidney Fels (1), Florian Vogt (1), Kees van den Doel (2), John E. Lloyd (2), Oliver Guenther (1)

(1) Department of Electrical and Computer Engineering; (2) Department of Computer Science, University of British Columbia, Vancouver, Canada

We describe our progress on the construction of ArtiSynth, a combined 3D face and vocal tract simulator for articulatory speech synthesis. The architecture provides six main modules: (1) a simulator engine and synthesis framework, (2) a two- and three-dimensional model development component, (3) a numerics engine, (4) a graphical renderer, (5) an audio synthesis engine, and (6) a graphical user interface (GUI). We have built infrastructure for creating vocal tract models from combinations of rigid-body, spring-mass, and finite element models, along with some parametric models. This infrastructure provides mechanisms to "glue" these and other model types together to create hybrids. Dynamical models, whose equations of motion are integrated numerically, and animatable parametric models are combined in a single framework. Using ArtiSynth, we have created a complex, dynamic jaw model based on muscle models, a parametric tongue model, a face model, two lip models, and a source-filter-based acoustic model linked to the vocal tract model via an airway model. These have been connected to form a complete vocal tract that produces speech and can be driven both by data and by dynamics.
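The idea of combining numerically integrated dynamic models with parametric models in one framework can be illustrated with a minimal sketch. It is not ArtiSynth's implementation or API: the names `Model`, `SpringMass`, `ParametricTarget`, `Attachment`, and `simulate`, the one-dimensional setting, and the choice of a semi-implicit Euler integrator are all assumptions made for illustration. Every model exposes the same `advance` step, and a "glue" object copies the output of a parametric model into the anchor of a dynamic one.

```python
from dataclasses import dataclass
from typing import Callable, List


class Model:
    """Hypothetical common interface: advance the model from t to t + dt."""

    def advance(self, t: float, dt: float) -> None:
        raise NotImplementedError


@dataclass
class SpringMass(Model):
    """Dynamic model: a 1D mass on a damped spring with a movable anchor."""
    mass: float = 1.0
    stiffness: float = 100.0
    damping: float = 2.0
    anchor: float = 0.0
    position: float = 0.0
    velocity: float = 0.0

    def advance(self, t: float, dt: float) -> None:
        # Semi-implicit Euler: update velocity from the current force,
        # then update position from the new velocity.
        force = (-self.stiffness * (self.position - self.anchor)
                 - self.damping * self.velocity)
        self.velocity += dt * force / self.mass
        self.position += dt * self.velocity


@dataclass
class ParametricTarget(Model):
    """Parametric model: its value is a prescribed function of time."""
    curve: Callable[[float], float]
    value: float = 0.0

    def advance(self, t: float, dt: float) -> None:
        # No integration: simply evaluate the curve at the new time.
        self.value = self.curve(t + dt)


class Attachment:
    """Hypothetical 'glue': drives a spring anchor from a parametric value."""

    def __init__(self, source: ParametricTarget, target: SpringMass) -> None:
        self.source = source
        self.target = target

    def apply(self) -> None:
        self.target.anchor = self.source.value


def simulate(models: List[Model], attachments: List[Attachment],
             t_end: float, dt: float) -> float:
    """Advance all models in lockstep, applying the glue after each step."""
    t = 0.0
    for _ in range(round(t_end / dt)):
        for model in models:
            model.advance(t, dt)
        for attachment in attachments:
            attachment.apply()
        t += dt
    return t
```

In this sketch, a `ParametricTarget` built from a constant curve and attached to a `SpringMass` pulls the mass toward that constant value as `simulate` runs; replacing the curve with recorded data would correspond, loosely, to driving the hybrid by data rather than by dynamics alone.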

