9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Comparing Text-Driven and Speech-Driven Visual Speech Synthesisers

Barry-John Theobald (1), Gavin Cawley (1), Andrew Bangham (1), Iain Matthews (2), Nicholas Wilkinson (1)

(1) University of East Anglia, UK; (2) Weta Digital Limited, New Zealand

We present a comparison of a text-driven and a speech-driven visual speech synthesiser. Both are trained using the same data and both use the same Active Appearance Model (AAM) to encode and re-synthesise visual speech. Objective quality, measured using correlation, suggests the performance of both approaches is close, but subjective opinion ranks the text-driven approach significantly higher.
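As an illustration of the kind of objective measure mentioned above, the following is a minimal sketch that scores a synthesised sequence against ground truth using correlation. The abstract does not say which correlation coefficient was used or what it was computed over. This sketch assumes Pearson correlation computed per AAM parameter trajectory and averaged across parameters. The function names `pearson` and `mean_parameter_correlation` are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def mean_parameter_correlation(ground_truth, synthesised):
    """Average per-parameter correlation between two trajectories.

    Both arguments are lists of frames, and each frame is a list of
    parameter values (assumed here to be AAM parameters). Averaging over
    parameters is an assumption of this sketch, not the paper's stated method.
    """
    if len(ground_truth) != len(synthesised):
        raise ValueError("trajectories must have the same number of frames")
    # Transpose from frames x parameters to parameters x frames.
    gt_tracks = list(zip(*ground_truth))
    syn_tracks = list(zip(*synthesised))
    scores = [pearson(g, s) for g, s in zip(gt_tracks, syn_tracks)]
    return sum(scores) / len(scores)
```

A score near 1 means the synthesised parameter trajectories follow the shape of the ground truth closely. Because correlation ignores scale and offset, two synthesisers can score similarly on it while still differing in how their output is perceived by viewers.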

Bibliographic reference. Theobald, Barry-John / Cawley, Gavin / Bangham, Andrew / Matthews, Iain / Wilkinson, Nicholas (2008): "Comparing text-driven and speech-driven visual speech synthesisers", In INTERSPEECH-2008, 2322.