September 22-25, 1997
Ever since the publication of Bolt's ground-breaking "Put-That-There" paper, providing multiple modalities as a means of easing the interaction between humans and computers has been a desirable attribute of user interface design. In Bolt's early approach, the style of modality combination required the user to conform to a rigid order when entering spoken and gestural commands. In the early 1990s, the idea of synergistic multimodal combination began to emerge, although actual implemented systems (generally using keyboard and mouse) remained far from synergistic. Next-generation approaches used time-stamped events to reason about the fusion of multimodal input arriving in a given time window, but these systems were hindered by time-consuming matching algorithms. To overcome this limitation, we proposed a truly synergistic application and a distributed architecture for flexible interaction that reduces the need for explicit time stamping. Our slot-based approach is command directed, making it suitable for applications using speech as a primary modality. In this article, we use our interaction model to demonstrate that during multimodal fusion, speech should be a privileged modality, driving the interpretation of a query, and that in certain cases, speech has even more power to override and modify the combination of other modalities than previously believed.
Bibliographic reference. Julia, Luc E. / Cheyer, Adam J. (1997): "Speech: a privileged modality", In EUROSPEECH-1997, 1843-1846.