ISCA Archive DiSS 2005

Gesture marking of disfluencies in spontaneous speech

Yelena Yasinnik, Stefanie Shattuck-Hufnagel, Nanette Veilleux

Speakers effectively use both visual and acoustic cues to convey information in speech. While earlier research has concentrated on the association of visual cues (provided by gestures) with fluent prosodic structure, this study looks at the relationship between visual cues, prosodic markers and spoken disfluencies. Preliminary results suggested that speakers preferentially perform gestures in the eye region during spoken disfluencies, but a more careful frame-by-frame analysis capturing all gestures revealed that movements of the eye region (blinks, frowns, eyebrow raises and changes in direction of eye gaze) occur with high frequency in both fluent and non-fluent speech. The paper describes a method for frame-by-frame labelling of speech-accompanying gestures for a speech sample, whose output can then be combined with independently derived labels of the prosody. Initial analysis of 3-minute samples from two speakers reveals that one speaker produces eye movements in association with disfluencies and the other does not, and that this tendency does not result from alignment of brow gestures with pitch accents.

Cite as: Yasinnik, Y., Shattuck-Hufnagel, S., Veilleux, N. (2005) Gesture marking of disfluencies in spontaneous speech. Proc. Disfluency in Spontaneous Speech (DiSS 2005), 173-178

  author={Yelena Yasinnik and Stefanie Shattuck-Hufnagel and Nanette Veilleux},
  title={{Gesture marking of disfluencies in spontaneous speech}},
  booktitle={Proc. Disfluency in Spontaneous Speech (DiSS 2005)},