ISCA Archive Interspeech 2013

Binocular photometric stereo acquisition and reconstruction for 3d talking head applications

Chaoyang Wang, Lijuan Wang, Yasuyuki Matsushita, Bojun Huang, Magnetro Chen, Frank K. Soong

To render a high-quality, versatile 3D talking head, we constructed a stable, high-frame-rate audio-visual (AV) data acquisition system. It captures the 3D position, surface orientation, and albedo texture of a talking head from video images, along with the corresponding speech signals. The system consists of a computer-controlled LED lighting subsystem, high-speed stereo cameras, a microphone, and a computer for synchronous recording of multi-stream AV data. The collected image data is processed through a binocular photometric stereo 3D reconstruction pipeline. The pipeline automatically segments the face, computes a depth map with binocular stereo, computes a normal map with photometric stereo, generates an albedo texture, and finally constructs a highly detailed 3D model using the depth and normal cues as constraints. With data collected by this system, we can capture high-quality dynamic facial performance synchronized with the subject's speech.
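
The normal-map and albedo steps can be illustrated with a minimal sketch of classical Lambertian photometric stereo. The abstract does not specify the paper's actual formulation, so this is an assumption for illustration only: the function name `photometric_stereo`, the known unit light directions, and the shadow-free Lambertian model are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Classical Lambertian photometric stereo for a batch of pixels.

    intensities: (k, n) array, k images under different lights, n pixels.
    light_dirs:  (k, 3) array of unit light directions (assumed known).
    Returns (normals of shape (n, 3), albedo of shape (n,)).
    """
    # Lambertian model (no shadows or highlights assumed):
    #   I = L @ (albedo * normal)
    # Solve for g = albedo * normal per pixel in the least-squares sense.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, n)
    # The length of g is the albedo; its direction is the surface normal.
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.maximum(albedo, 1e-12)).T
    return normals, albedo
```

At least three non-coplanar light directions are needed for the per-pixel solve to be well-posed; additional lights make the least-squares estimate more robust to noise.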


doi: 10.21437/Interspeech.2013-630

Cite as: Wang, C., Wang, L., Matsushita, Y., Huang, B., Chen, M., Soong, F.K. (2013) Binocular photometric stereo acquisition and reconstruction for 3d talking head applications. Proc. Interspeech 2013, 2748-2752, doi: 10.21437/Interspeech.2013-630

@inproceedings{wang13g_interspeech,
  author={Chaoyang Wang and Lijuan Wang and Yasuyuki Matsushita and Bojun Huang and Magnetro Chen and Frank K. Soong},
  title={{Binocular photometric stereo acquisition and reconstruction for 3d talking head applications}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2748--2752},
  doi={10.21437/Interspeech.2013-630}
}