9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

A Probabilistic Trajectory Synthesis System for Synthesising Visual Speech

Barry-John Theobald, Nicholas Wilkinson

University of East Anglia, UK

We describe an unsupervised probabilistic approach for synthesising visual speech from audio. Acoustic features representing a training corpus are clustered, and the probability density function (PDF) of each cluster is modelled as a Gaussian mixture model (GMM). A visual target, in the form of a short-term parameter trajectory, is generated for each cluster. Synthesis involves combining the cluster targets according to the likelihood of novel acoustic feature vectors, then cross-blending neighbouring regions of the synthesised short-term trajectories. The advantage of our approach is that coarticulation effects are explicitly captured by the mapping: the influence of each cluster target naturally increases and decreases with the likelihood of the acoustic feature vectors.
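The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: plain k-means stands in for the clustering step, a single diagonal Gaussian per cluster stands in for the GMM, each visual target is taken to be the mean visual window centred on the cluster's frames, and the cross-blend is a Hann-weighted overlap-add. The function names (`kmeans`, `train`, `synthesise`) and the window half-length `half` are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (assumed clustering step); returns one label per row of X."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centres[None]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centres[c] = X[labels == c].mean(0)
    return labels

def train(audio, visual, k=4, half=2):
    """Return a list of (mean, var, target) tuples, one per usable cluster.

    mean/var: diagonal Gaussian over acoustic features (stand-in for a GMM).
    target:   mean short-term visual trajectory, shape (2*half+1, Dv).
    """
    labels = kmeans(audio, k)
    T = len(audio)
    models = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        full = idx[(idx >= half) & (idx < T - half)]  # frames with a full window
        if len(full) == 0:
            continue  # skip empty clusters or clusters with no full window
        members = audio[idx]
        mean, var = members.mean(0), members.var(0) + 1e-6
        windows = np.stack([visual[t - half:t + half + 1] for t in full])
        models.append((mean, var, windows.mean(0)))
    return models

def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def synthesise(audio_new, models, half=2):
    """Likelihood-weighted combination of targets, then overlap-add blending."""
    T = len(audio_new)
    targets = np.stack([m[2] for m in models])       # (K, 2*half+1, Dv)
    win = np.hanning(2 * half + 3)[1:-1]             # non-zero Hann taper
    out = np.zeros((T, targets.shape[2]))
    norm = np.zeros(T)
    for t in range(T):
        ll = np.array([log_likelihood(audio_new[t], m[0], m[1]) for m in models])
        w = np.exp(ll - ll.max())                    # stable normalisation
        w /= w.sum()
        traj = np.tensordot(w, targets, axes=1)      # blended short-term trajectory
        for j, off in enumerate(range(-half, half + 1)):
            s = t + off
            if 0 <= s < T:
                out[s] += win[j] * traj[j]
                norm[s] += win[j]
    return out / norm[:, None]
```

Given acoustic features `audio` of shape `(T, Da)` and time-aligned visual parameters `visual` of shape `(T, Dv)`, `synthesise(new_audio, train(audio, visual))` returns one visual parameter vector per novel acoustic frame. Because each frame's weights are normalised likelihoods, a cluster's target fades in and out as the acoustics move towards and away from that cluster, which is the behaviour the abstract attributes to the mapping.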


Bibliographic reference. Theobald, Barry-John / Wilkinson, Nicholas (2008): "A probabilistic trajectory synthesis system for synthesising visual speech", in INTERSPEECH-2008, 1857-1860.