7th International Conference on Spoken Language Processing
September 16-20, 2002
In a multimodal conversation, user inputs are usually abbreviated or imprecise. Fusing inputs together is, by itself, inadequate for reaching a full understanding. To address this problem, we have developed a context-based approach for multimodal interpretation. In particular, we present three operations: ordering, covering, and aggregation. Using feature structures that represent the intention and attention identified from user inputs and the overall conversation, these operations provide a mechanism that combines multimodal fusion with context-based inference. They allow our system to process a variety of multimodal user inputs, including incomplete and ambiguous ones.
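To make the idea concrete, the following is a minimal sketch of how covering and aggregation might operate on feature structures encoding intention and attention. The abstract does not specify the actual algorithms, so the representation (nested dicts), the operation semantics, and all names and values here are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical feature-structure operations for multimodal fusion.
# Structures are nested dicts; None marks an underspecified feature.

def covers(general, specific):
    """A structure covers another if every feature it specifies is
    matched (or further refined) in the other structure."""
    for key, value in general.items():
        if key not in specific:
            return False
        if isinstance(value, dict):
            if not isinstance(specific[key], dict) or not covers(value, specific[key]):
                return False
        elif value is not None and specific[key] != value:
            return False
    return True

def aggregate(a, b):
    """Merge two compatible feature structures; conflicting atomic
    values make them incompatible, signalled by returning None."""
    merged = dict(a)
    for key, value in b.items():
        if key not in merged or merged[key] is None:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            sub = aggregate(merged[key], value)
            if sub is None:
                return None
            merged[key] = sub
        elif value is not None and merged[key] != value:
            return None  # conflicting values: cannot aggregate
    return merged

# Speech carries the intention but leaves the attended object underspecified
# (e.g. "how much is this house?"); a pointing gesture resolves it.
speech = {"intention": "ask-price", "attention": {"type": "house", "id": None}}
gesture = {"attention": {"type": "house", "id": "house-7"}}

fused = aggregate(speech, gesture)
```

In this sketch, aggregation fills the underspecified `id` from the gesture, and the gesture's structure covers the fused result, which is one way an incomplete input can be completed against another modality or the conversation context.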
Bibliographic reference. Chai, Joyce (2002): "Operations for context-based multimodal interpretation in conversational systems", In ICSLP-2002, 2249-2252.