ISCA Archive SPECOM 2004

Speech driven MPEG-4 facial animation for Turkish

Arman Savran, Levent M. Arslan, Lale Akarun

In this study, a system that generates visual speech by synthesizing 3D face points has been implemented. The synthesized face points drive MPEG-4 facial animation. To produce realistic and natural speech animation, a codebook-based technique trained with audio-visual data from a speaker was employed. The audio-visual speech database was created using a 3D facial motion capture system developed for this study. To improve the performance of the system when used by different speakers, further training was performed with audio-only data from a small number of speakers. The resulting system is capable of animating faces from the input speech of any Turkish speaker.


Cite as: Savran, A., Arslan, L.M., Akarun, L. (2004) Speech driven MPEG-4 facial animation for Turkish. Proc. 9th Conference on Speech and Computer (SPECOM 2004), 57-64.

@inproceedings{savran04_specom,
  author={Arman Savran and Levent M. Arslan and Lale Akarun},
  title={{Speech driven MPEG-4 facial animation for Turkish}},
  year=2004,
  booktitle={Proc. 9th Conference on Speech and Computer (SPECOM 2004)},
  pages={57--64}
}