Auditory-Visual Speech Processing (AVSP) 2009
University of East Anglia, Norwich, UK
This paper presents an image-based talking head system comprising two parts: analysis and synthesis. The analysis stage creates a database containing a large number of mouth images and their associated facial and speech features. The synthesis stage generates realistic facial animation from phonetic transcripts of text. The animation is produced by selecting and concatenating mouth images that match the spoken words of the talking head. Subjective tests show that 60% of the animations are indistinguishable from real recordings.
Index Terms: talking head, unit selection, evaluation
Bibliographic reference. Liu, Kang / Ostermann, Joern (2009): "An image-based talking head system", In AVSP-2009, 166.