ISCA Archive Interspeech 2008

A trainable trajectory formation model TD-HMM parameterized for the LIPS 2008 challenge

Gérard Bailly, Oxana Govokhina, Gaspard Breton, Frédéric Elisei, Christophe Savariaux

We describe the trainable trajectory formation model that will be used for the LIPS'2008 challenge organized at InterSpeech'2008. The model predicts articulatory trajectories of a talking face from phonetic input. It builds on HMM-based synthesis, but handles asynchrony between acoustic and gestural boundaries (for example, non-audible anticipatory gestures) with a phasing model that predicts the delays between the acoustic boundaries of the allophones to be synthesized and the gestural boundaries of the HMM triphones. The HMM triphones and the phasing model are trained simultaneously using an iterative analysis-synthesis loop, which converges within a few iterations. Using different motion capture data, we demonstrate that the phasing model significantly reduces the prediction error and captures subtle context-dependent anticipatory phenomena.
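
To make the idea of a phasing model concrete, the sketch below shows one very simple way to predict gestural boundaries from acoustic ones. It is an illustration only and not the authors' implementation: the abstract does not specify the form of the phasing model, so the choice of a mean delay per boundary context, the function names `train_phasing` and `predict_gestural`, and the example timings are all assumptions made for this example.

```python
from collections import defaultdict


def train_phasing(observations):
    """Estimate a mean delay (gestural minus acoustic time) per context.

    `observations` is an iterable of (context, acoustic_time, gestural_time),
    where `context` is any hashable label for the boundary, here a
    hypothetical (left_phone, right_phone) pair. Times are in seconds.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for context, t_acoustic, t_gestural in observations:
        sums[context] += t_gestural - t_acoustic
        counts[context] += 1
    return {context: sums[context] / counts[context] for context in sums}


def predict_gestural(acoustic_boundaries, delays, default_delay=0.0):
    """Shift each acoustic boundary by the delay learned for its context.

    Contexts unseen in training fall back to `default_delay`
    (an assumption of this sketch).
    """
    return [
        t_acoustic + delays.get(context, default_delay)
        for context, t_acoustic in acoustic_boundaries
    ]


# Hypothetical training data: a negative delay means the gesture
# starts before the acoustic boundary (anticipation).
training = [
    (("p", "a"), 0.10, 0.06),
    (("p", "a"), 0.30, 0.24),
    (("a", "t"), 0.50, 0.52),
]
delays = train_phasing(training)
predicted = predict_gestural([(("p", "a"), 1.0), (("x", "y"), 2.0)], delays)
```

In the model described above, such delays would be re-estimated jointly with the HMM triphones inside the analysis-synthesis loop; this sketch covers only the delay lookup step under the stated assumptions.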


doi: 10.21437/Interspeech.2008-592

Cite as: Bailly, G., Govokhina, O., Breton, G., Elisei, F., Savariaux, C. (2008) A trainable trajectory formation model TD-HMM parameterized for the LIPS 2008 challenge. Proc. Interspeech 2008, 2318-2321, doi: 10.21437/Interspeech.2008-592

@inproceedings{bailly08_interspeech,
  author={Gérard Bailly and Oxana Govokhina and Gaspard Breton and Frédéric Elisei and Christophe Savariaux},
  title={{A trainable trajectory formation model TD-HMM parameterized for the LIPS 2008 challenge}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2318--2321},
  doi={10.21437/Interspeech.2008-592}
}