FAAVSP - The 1st Joint Conference on
Facial Analysis, Animation, and
In this paper we incorporate dynamic visemes into hidden Markov model (HMM)-based visual speech synthesis. Dynamic visemes represent intuitive visual gestures identified automatically by clustering purely visual speech parameters. Because they span multiple phones, they capture the effects of visual coarticulation explicitly within the unit. The previous application of dynamic visemes to synthesis used a sample-based approach, in which cluster centroids were concatenated to form parameter trajectories corresponding to novel visual speech. In this paper we generalize the use of these units to create more flexible and dynamic animation using an HMM-based synthesis framework. We show, using objective and subjective testing, that an HMM synthesizer trained using dynamic visemes can generate better visual speech than HMM synthesizers trained using either phone or traditional viseme units.
Index Terms: visual speech synthesis, hidden Markov model, dynamic visemes
Bibliographic reference. Thangthai, Ausdang / Theobald, Barry-John (2015): "HMM-based visual speech synthesis using dynamic visemes", In FAAVSP-2015, 88-92.