7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Audiovisual Speech Synthesis: From Ground Truth to Models

Gérard Bailly

Institut de la Communication Parlée/INPG, France

We present the main approaches used to synthesize and drive talking faces, with illustrative systems described for each. We distinguish between facial synthesis itself (i.e. the manner in which facial movements are rendered on a computer screen) and the way these movements may be controlled and predicted from phonetic input. We then focus on the necessity of capturing, modeling, and rendering with maximum fidelity the intimate coherence of the facial deformations observed on a human face.

Bibliographic reference. Bailly, Gérard (2002): "Audiovisual speech synthesis: from ground truth to models", in ICSLP-2002, 1453-1456.