International Conference on Auditory-Visual Speech Processing 2008
Tangalooma Wild Dolphin Resort,
Moreton Island, Queensland, Australia
Cued Speech is a communication system that complements lip-reading with a small set of possible handshapes placed in different positions near the face. Developing a Cued Speech capable system is a time-consuming and difficult challenge. This paper focuses on how an existing bank of reference Cued Speech gestures, exhibiting natural dynamics for hand articulation and movements, could be reused for another speaker (augmenting some video or 3D talking heads). Any Cued Speech hand gesture should be recorded or considered with the concomitant facial locations that Cued Speech specifies to leverage the lip reading ambiguities (such as lip corner, chin, cheek and throat for French). These facial target points are moving along with head movements and because of speech articulation. The post-treatment algorithm proposed here will retarget synthesized hand gestures to another face, by slightly modifying the sequence of translations and rotations of the 3D hand. This algorithm preserves the coarticulation of the reference signal (including undershooting of the trajectories, as observed in fast Cued Speech) while adapting the gestures to the geometry, articulation and movements of the target face. We will illustrate how our Cued Speech capable audiovisual synthesizer - built using simultaneously recorded hand trajectories and facial articulation of a single French Cued Speech user - can be used as a reference signal for this retargeting algorithm. For the ongoing evaluation of our algorithm, an intelligibility paradigm has been retained, using natural videos for the face. The intelligibility of some video VCV sequences with composited hand gestures for Cued Speech is being measured using a panel of Cued Speech users.
Bibliographic reference. Bailly, Gérard / Fang, Yu / Elisei, Frédéric / Beautemps, Denis (2008): "Retargeting cued speech hand gestures for different talking heads and speakers", In AVSP-2008, 153-158.