Auditory-Visual Speech Processing 2005
British Columbia, Canada
In this paper, we present the 3D acquisition infrastructure we developed for building a talking face and studying aspects of visual speech. Our short-term aim is to study coarticulation in French and to develop a model that is faithful to a real talker's articulation. A key requirement is the ability to acquire large amounts of 3D data with a low-cost system that is more flexible than existing motion capture systems, which rely on infrared cameras and glued markers.
Our system uses only two standard cameras, a PC, and painted markers that do not alter speech articulation, and its acquisition rate is fast enough to allow efficient temporal tracking of the 3D points. We present our stereovision data capture system and show how the resulting data can be used in acoustic-to-articulatory inversion.
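As an illustration of how a painted marker seen by two cameras can yield a 3D point, the sketch below uses generic midpoint triangulation. This abstract does not describe the reconstruction method actually used in the paper, so the technique shown here is an assumption chosen for simplicity. It also assumes calibrated cameras whose optical centers and per-marker viewing-ray directions are already known in a common world frame. The function name `triangulate_midpoint` and all parameter names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(
    c1: Sequence[float],
    d1: Sequence[float],
    c2: Sequence[float],
    d2: Sequence[float],
    eps: float = 1e-12,
) -> Vec3:
    """Estimate a 3D point from two viewing rays (hypothetical helper).

    c1, c2: optical centers of the two cameras (assumed known from calibration).
    d1, d2: directions of the rays through the marker's image in each view.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)

    # denom vanishes when the rays are parallel; eps is an absolute
    # threshold, so it should be adapted to the scale of the directions.
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("viewing rays are (nearly) parallel")

    # Parameters of the closest points on ray 1 and ray 2.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    mid = tuple((u + v) / 2.0 for u, v in zip(p1, p2))
    return (mid[0], mid[1], mid[2])


# Two cameras one unit apart, both looking at the same marker.
point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
    (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0),
)
print(point)
```

With noisy marker detections the two rays generally do not intersect, which is why the midpoint of their closest approach is returned rather than an exact intersection.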
Bibliographic reference. Wrobel-Dautcourt, Brigitte / Berger, M. O. / Potard, B. / Laprie, Yves / Ouni, Slim (2005): "A low-cost stereovision based system for acquisition of visible articulatory data", In AVSP-2005, 145-150.