7th International Conference on Spoken Language Processing
September 16-20, 2002
The main scientific goal of the SmartKom project is to develop a new human-machine interaction metaphor for multimodal dialog systems. It combines speech, gesture, and facial expression input with speech, gesture, and graphics output. The system is realized as a distributed collection of communicating and cooperating autonomous modules based on a multi-blackboard architecture. Multimodal output generation is consequently separated into two steps. First, the modality-specific output data are generated. Second, an inter-media synchronization of these data is realized on independent media devices to perform the multimodal presentation to the user. This paper describes the generation of appropriate lip animations from a phonetic representation of the speech output signal and, as a second computational step, the timestamp-based realization of audio-visual speech output on distributed media devices.
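The two-step pipeline described above can be sketched as follows. This is a minimal illustration, not the SmartKom implementation: the phoneme-to-viseme table, the phonetic input format, and all function names are assumptions introduced here for clarity.

```python
# Illustrative sketch (assumed, not the SmartKom code): derive timed viseme
# keyframes from a phonetic representation of the speech signal (step 1),
# then attach absolute presentation timestamps so that independent media
# devices can render audio and lip animation in sync (step 2).

# Hypothetical phonetic representation: (phoneme, start_ms, end_ms)
PHONEMES = [("h", 0, 80), ("@", 80, 200), ("l", 200, 280), ("o:", 280, 520)]

# Toy phoneme-to-viseme table; a real system would use a complete mapping.
PHONEME_TO_VISEME = {"h": "rest", "@": "open", "l": "tongue_up", "o:": "round"}

def viseme_track(phonemes):
    """Step 1: convert timed phonemes into timed viseme keyframes."""
    return [(PHONEME_TO_VISEME.get(p, "rest"), start, end)
            for p, start, end in phonemes]

def schedule(track, t0_ms):
    """Step 2: attach absolute timestamps relative to a shared start time
    t0_ms, agreed upon by the audio and animation devices."""
    return [(v, t0_ms + start, t0_ms + end) for v, start, end in track]

track = viseme_track(PHONEMES)
events = schedule(track, t0_ms=1000)
# Each device can now realize its stream against the common clock.
```

Because both devices schedule against the same reference timestamp rather than exchanging frames directly, the audio and animation streams can be produced and played back on separate machines.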
Bibliographic reference. Poller, Peter / Müller, Jochen (2002): "Distributed audio-visual speech synchronization", In ICSLP-2002, 205-208.