Anticipation of Turn-Switching in Auditory-Visual Dialogs

Hansjörg Mixdorff (1), Angelika Hönemann (2), Jeesun Kim (3), Chris Davis (3)

(1) Department of Computer Science and Media, Beuth University Berlin, Germany
(2) University of Bielefeld, CITEC, Germany
(3) MARCS Institute, University of Western Sydney, Australia

This paper presents an experiment examining whether German and Australian English perceivers were able to predict imminent turn-switching in Australian English auditory-visual dialogs. Subjects were presented with excerpts of one or four seconds' duration, taken either from immediately before a switch or from inside a turn, and had to decide which condition they had seen. Stimuli were A/V, video-only or audio-only. Results on the one-second excerpts were close to chance. In general, we found a preference for non-switching. Australian subjects outperformed German subjects in the audio-only condition, but outcomes were almost equal on the A/V stimuli. Analysis of the syntactic and prosodic properties of the stimuli showed that phrase-final statement intonation as well as question intonation facilitated recognition, presumably because they act as markers of turn-switch preparation, whereas incomplete sentences and non-terminal intonation were indicative of turn-internal excerpts. Results for visual cues signaling an upcoming switch were rather varied. An open mouth on the part of the listener preceded switches more often than not.

Index Terms: auditory-visual prosody, dialog, turn-switching

