ISCA Archive Interspeech 2008

Comparing text-driven and speech-driven visual speech synthesisers

Barry-John Theobald, Gavin Cawley, Andrew Bangham, Iain Matthews, Nicholas Wilkinson

We present a comparison of a text-driven and a speech-driven visual speech synthesiser. Both are trained using the same data and both use the same Active Appearance Model (AAM) to encode and re-synthesise visual speech. Objective quality, measured using correlation, suggests that the two approaches perform similarly, but subjective opinion ranks the text-driven approach significantly higher.


Cite as: Theobald, B.-J., Cawley, G., Bangham, A., Matthews, I., Wilkinson, N. (2008) Comparing text-driven and speech-driven visual speech synthesisers. Proc. Interspeech 2008, 2322

@inproceedings{theobald08c_interspeech,
  author={Barry-John Theobald and Gavin Cawley and Andrew Bangham and Iain Matthews and Nicholas Wilkinson},
  title={{Comparing text-driven and speech-driven visual speech synthesisers}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2322}
}