Auditory-Visual Speech Processing 2005

British Columbia, Canada
July 24-27, 2005

Statistical Analysis and Synthesis of 3D Faces for Auditory-Visual Speech Animation

Takaaki Kuratate

ATR Human Information Science Laboratories, Japan

In this paper, we demonstrate a statistical approach for creating a 3D face from photographs by exploiting information gained from faces scanned into a large 3D face database. We also use this database to estimate facial expressions, producing the speech-related deformations needed for talking head animation in auditory-visual speech research.

The database contains nine different face postures from over two hundred people and is analyzed by principal component analysis (PCA). A small set of facial feature points and the profile silhouette line, extracted from front- and side-view photographs, are used to create a novel 3D face from the PCA results by linear estimation. Any new neutral 3D face can be represented by fifty eigenvectors obtained from this PCA, and its deformation characteristics can be estimated from faces in the database that are close to the input face in the eigenvector space. The estimated facial expressions are quite natural. The same method can also be applied to human-like faces such as those found in statues or dolls. Additional PCA of the estimated face postures can then be used to create 3D talking head animation.
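
As a rough illustration of the pipeline described above, the following sketch fits a PCA model to a set of neutral faces, estimates a full face from a small subset of observed coordinates by linear least squares, and then borrows a posture deformation from the nearest database faces in eigenvector space. It is only a minimal sketch under stated assumptions, not the method as implemented in the paper: the data are synthetic random vectors, the function names (`fit_pca`, `estimate_face`, `estimate_deformation`) are hypothetical, the observed subset stands in for the feature points and silhouette line, and averaging the displacements of the k nearest neighbours is an assumed way of combining nearby faces.

```python
import numpy as np


def fit_pca(faces, n_components):
    """Return the mean face and the top eigenvectors (rows) of the face set."""
    mean = faces.mean(axis=0)
    # Rows of vt are orthonormal eigenvectors of the sample covariance.
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(faces, mean, basis):
    """Coordinates of faces in the eigenvector space."""
    return (faces - mean) @ basis.T


def estimate_face(mean, basis, observed_idx, observed_values):
    """Least-squares PCA coefficients from a subset of coordinates."""
    design = basis[:, observed_idx].T              # (n_observed, n_components)
    target = observed_values - mean[observed_idx]
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return mean + coeffs @ basis, coeffs


def estimate_deformation(coeffs, db_coeffs, db_displacements, k=5):
    """ASSUMPTION: average the posture displacements of the k nearest faces."""
    dist = np.linalg.norm(db_coeffs - coeffs, axis=1)
    nearest = np.argsort(dist)[:k]
    return db_displacements[nearest].mean(axis=0)


# Synthetic stand-in for a face database (sizes are illustrative only).
rng = np.random.default_rng(0)
n_faces, n_coords, n_components = 200, 300, 5
true_basis = rng.normal(size=(n_components, n_coords))
offset = rng.normal(size=n_coords)
neutral_faces = rng.normal(size=(n_faces, n_components)) @ true_basis + offset
# Displacement from the neutral face to one non-neutral posture, per person.
posture_displacements = rng.normal(scale=0.1, size=(n_faces, n_coords))

mean, basis = fit_pca(neutral_faces, n_components)
db_coeffs = project(neutral_faces, mean, basis)

# A new face, of which only 30 coordinates are "measured from photographs".
new_face = rng.normal(size=n_components) @ true_basis + offset
observed_idx = rng.choice(n_coords, size=30, replace=False)
estimated_face, coeffs = estimate_face(
    mean, basis, observed_idx, new_face[observed_idx]
)

# Estimated non-neutral posture for the new face.
deformed_face = estimated_face + estimate_deformation(
    coeffs, db_coeffs, posture_displacements
)
```

Because the synthetic faces lie exactly in a five-dimensional subspace, the least-squares step recovers the unobserved coordinates almost exactly here; with real scans the reconstruction would only be approximate, and its quality would depend on the number of eigenvectors kept and on which points are observed.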

Bibliographic reference. Kuratate, Takaaki (2005): "Statistical analysis and synthesis of 3D faces for auditory-visual speech animation", in AVSP-2005, 131-136.