8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Complementarity and Redundancy in Multimodal User Inputs with Speech and Pen Gestures

Pui-Yu Hui, Zhengyu Zhou, Helen Meng

Chinese University of Hong Kong, China

We present a comparative analysis of multimodal user inputs with speech and pen gestures, together with their semantically equivalent unimodal (speech-only) counterparts. The multimodal interactions are derived from a corpus collected with a Pocket PC emulator in the context of navigation around Beijing. We devise a cross-modality integration methodology that interprets a multimodal input and paraphrases it as a semantically equivalent, unimodal input. Thus we generate parallel multimodal (MM) and unimodal (UM) corpora for comparative study. Empirical analysis based on class trigram perplexities shows two categories of data: (PP_MM = PP_UM) and (PP_MM < PP_UM). The former involves complementarity across modalities in expressing the user's intent, including occurrences of ellipses. The latter involves redundancy, which will be useful for handling recognition errors by exploiting mutual reinforcement. We present explanatory examples of data in these two categories.
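The abstract compares the parallel MM and UM corpora via class trigram perplexity. The paper does not specify its smoothing scheme or class inventory, so the following is only a minimal illustrative sketch: an add-one-smoothed class trigram model whose perplexity could be computed separately on MM and UM class sequences (the class labels here are hypothetical).

```python
import math
from collections import Counter

def trigram_perplexity(train, test):
    """Perplexity of a class trigram model trained on `train`, scored on `test`.

    Both arguments are lists of token (class-label) sequences. Uses add-one
    smoothing; the paper's actual LM configuration is not specified.
    """
    tri, bi = Counter(), Counter()
    vocab = set()
    for seq in train:
        toks = ["<s>", "<s>"] + seq + ["</s>"]  # pad with sentence markers
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1  # trigram counts
            bi[tuple(toks[i - 2:i])] += 1       # bigram-context counts
    v = len(vocab)

    def prob(w1, w2, w3):
        # Add-one-smoothed conditional probability P(w3 | w1, w2).
        return (tri[(w1, w2, w3)] + 1) / (bi[(w1, w2)] + v)

    log_sum, n = 0.0, 0
    for seq in test:
        toks = ["<s>", "<s>"] + seq + ["</s>"]
        for i in range(2, len(toks)):
            log_sum += math.log2(prob(toks[i - 2], toks[i - 1], toks[i]))
            n += 1
    return 2 ** (-log_sum / n)
```

With parallel corpora of class sequences, `trigram_perplexity(mm_train, mm_test)` and `trigram_perplexity(um_train, um_test)` would then be compared to sort data into the PP_MM = PP_UM and PP_MM < PP_UM categories described above.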


Bibliographic reference. Hui, Pui-Yu / Zhou, Zhengyu / Meng, Helen (2007): "Complementarity and redundancy in multimodal user inputs with speech and pen gestures", in INTERSPEECH-2007, 2205-2208.