ISCA Archive Interspeech 2008

Lip synchronization: from phone lattice to PCA eigen-projections using neural networks

Samer Al Moubayed, Michael De Smet, Hugo Van hamme

Lip synchronization is the process of generating natural lip movements from a speech signal. In this work we address the lip-sync problem using an automatic phone recognizer that generates a phone lattice carrying posterior probabilities. The acoustic feature vector contains the posterior probabilities of all the phones over a time window centered at the current time point. Hence this representation characterizes the phone recognition output including the confusion patterns caused by its limited accuracy. A 3D face model with varying texture is computed by analyzing a video recording of the speaker using a 3D morphable model. Training a neural network using 30 000 data vectors from an audiovisual recording in Dutch resulted in a very good simulation of the face on independent data sets of the same or of a different speaker.
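The mapping described above — stacking phone posteriors over a time window and regressing them through a neural network onto PCA face coefficients — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dimensions, window length, and randomly initialized one-hidden-layer MLP are all assumptions standing in for the trained system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the paper does not state its actual sizes here):
N_PHONES = 40   # phones whose posteriors come from the recognizer's lattice
WINDOW = 11     # frames in the time window centered on the current frame
N_EIGEN = 10    # PCA eigen-projections driving the 3D morphable face model
HIDDEN = 50     # hidden units in the stand-in network

def windowed_posteriors(post, t, window=WINDOW):
    """Stack phone posteriors over a window centered at frame t.

    Edge frames are padded by repeating the first/last frame so the
    feature vector always has length window * N_PHONES.
    """
    half = window // 2
    idx = np.clip(np.arange(t - half, t + half + 1), 0, len(post) - 1)
    return post[idx].ravel()

# Randomly initialized MLP standing in for the trained network.
W1 = rng.normal(scale=0.1, size=(HIDDEN, WINDOW * N_PHONES))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_EIGEN, HIDDEN))
b2 = np.zeros(N_EIGEN)

def predict_eigen_projections(x):
    """One tanh hidden layer, linear output: predicted PCA coefficients."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

# Fake posterior trajectory: 100 frames, each row a distribution over phones.
post = rng.random((100, N_PHONES))
post /= post.sum(axis=1, keepdims=True)

coeffs = predict_eigen_projections(windowed_posteriors(post, t=50))
# The rendered face for this frame would then be the PCA mean shape plus
# the eigenvector basis multiplied by these predicted coefficients.
```

In the actual system the network would be trained on audiovisual data (the paper reports 30 000 training vectors), with targets obtained by projecting the tracked 3D face onto its PCA basis.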


doi: 10.21437/Interspeech.2008-524

Cite as: Al Moubayed, S., De Smet, M., Van hamme, H. (2008) Lip synchronization: from phone lattice to PCA eigen-projections using neural networks. Proc. Interspeech 2008, 2016-2019, doi: 10.21437/Interspeech.2008-524

@inproceedings{moubayed08_interspeech,
  author={Samer Al Moubayed and Michael De Smet and Hugo {Van hamme}},
  title={{Lip synchronization: from phone lattice to PCA eigen-projections using neural networks}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2016--2019},
  doi={10.21437/Interspeech.2008-524}
}