Third International Conference on Spoken Language Processing (ICSLP 94)
In this paper we describe recent work in our laboratory on the use of computer vision techniques for real-time multi-modal interfaces. The methods described here allow non-invasive perception of human users; no special markers or identifying features are assumed. Both user-independent and user-dependent algorithms for gesture recognition are used, depending on the context. We apply the same techniques used for recognition to the problem of generating animated forms to accompany spoken language. Both real-time recognition and animation of facial gestures (e.g., a lip-synched "talking head") have been implemented within our framework.
Bibliographic reference. Pentland, Alex P. / Darrell, Trevor (1994): "Visual perception of human bodies and faces for multi-modal interfaces", in Proceedings of ICSLP 1994, pp. 543-546.