12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Weight Optimization for Bimodal Unit-Selection Talking Head Synthesis

Asterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte

LORIA, France

This paper addresses talking head synthesis based on the concatenation of units comprising both acoustic and visual information. The diphone units used to synthesize a given text string are selected by minimizing a weighted linear combination of four costs that reflect linguistic, acoustic, and visual considerations. We present initial work toward a method for automatically determining the weight applied to each cost, using a series of metrics that quantitatively assess synthesis performance.
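
As a rough illustration of selection by a weighted linear combination of costs, the following minimal sketch picks, from a set of candidate units, the one whose weighted cost is lowest. It is not the authors' implementation: the function names, the candidate identifiers, the ordering of the four cost values, and all numbers are hypothetical, and a real unit-selection system would search over whole unit sequences rather than score candidates in isolation.

```python
from typing import Mapping, Sequence


def weighted_cost(costs: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted linear combination sum(w_i * c_i)."""
    if len(costs) != len(weights):
        raise ValueError("costs and weights must have the same length")
    return sum(w * c for w, c in zip(weights, costs))


def select_unit(candidates: Mapping[str, Sequence[float]],
                weights: Sequence[float]) -> str:
    """Return the id of the candidate with the smallest weighted cost."""
    return min(candidates, key=lambda uid: weighted_cost(candidates[uid], weights))


# Hypothetical candidates: each maps to four illustrative cost values.
# The values and their ordering are assumptions, not data from the paper.
candidates = {
    "unit_a": (0.2, 0.5, 0.1, 0.4),
    "unit_b": (0.1, 0.3, 0.6, 0.2),
    "unit_c": (0.4, 0.1, 0.2, 0.3),
}

equal_weights = (1.0, 1.0, 1.0, 1.0)
third_cost_only = (0.0, 0.0, 1.0, 0.0)

print(select_unit(candidates, equal_weights))
print(select_unit(candidates, third_cost_only))
```

Changing the weights changes which candidate wins, which is why the choice of weights matters for the quality of the synthesized output.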


Bibliographic reference. Toutios, Asterios / Musti, Utpala / Ouni, Slim / Colotte, Vincent (2011): "Weight optimization for bimodal unit-selection talking head synthesis", In INTERSPEECH-2011, 2249-2252.