In conversation, speakers spontaneously produce manual gestures that can facilitate listeners’ comprehension of speech. However, various factors may affect listeners’ ability to use gesture cues. Here we examine a situation where a speaker is referring to physical objects in the contextual here-and-now. In this situation, objects for potential reference will compete with gestures for visual attention. In two experiments, a speaker provided instructions to pick up objects in the visual environment (“Pick up the candy”). On some trials, the speaker produced a “pick up” gesture that reflected the size/shape of the target object. Gaze position was recorded to evaluate how listeners allocated attention to scene elements. Experiment 1 showed that, although iconic gestures (when present) were rarely fixated directly, peripheral uptake of these cues speeded listeners’ visual identification of intended referents as the instruction unfolded. However, the benefit was mild and occurred primarily for small/hard-to-identify objects. In Experiment 2, background noise was added to reveal whether challenging auditory environments lead listeners to allocate additional visual attention to gesture cues in a compensatory manner. Interestingly, background noise actually reduced listeners’ use of gesture cues. Together, the findings highlight how situational factors govern the use of visual cues during multimodal communication.
Cite as: Saryazdi, R., Chambers, C.G. (2017) Attentional Factors in Listeners’ Uptake of Gesture Cues During Speech Processing. Proc. Interspeech 2017, 869-873, doi: 10.21437/Interspeech.2017-1676
@inproceedings{saryazdi17_interspeech,
  author={Raheleh Saryazdi and Craig G. Chambers},
  title={{Attentional Factors in Listeners’ Uptake of Gesture Cues During Speech Processing}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={869--873},
  doi={10.21437/Interspeech.2017-1676}
}