ISCA Archive Interspeech 2013

A sequential repetition model for improved disfluency detection

Mari Ostendorf, Sangyun Hahn

This paper proposes a new method for automatically detecting disfluencies in spontaneous speech (specifically, self-corrections) that explicitly models repetitions separately from other disfluencies. We show that, in a corpus of Supreme Court oral arguments, repetition disfluencies can be longer and more stutter-like than the short repetitions observed in the Switchboard corpus, and we suggest that they are better represented with a flat structure that covers the full sequence. Since these repetitions are relatively easy to detect, weakly supervised training is an effective way to minimize labeling costs. By explicitly modeling repetitions, we improve general disfluency detection within and across domains, and we provide a richer transcript.
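
To make the idea of a flat repetition structure concrete, the following is a minimal sketch, not the authors' model: it uses simple exact string matching to mark a whole run of back-to-back repeated words or short phrases as one span, rather than nesting pairwise corrections. The function name `find_repetition_runs` and the `max_len` parameter are hypothetical, introduced only for illustration. Spans found this way are the kind of cheap automatic labels that weakly supervised training could plausibly start from.

```python
def find_repetition_runs(tokens, max_len=3):
    """Return (start, end, unit_len) spans where a unit of 1..max_len tokens
    repeats back-to-back. Each span covers the full run (flat structure).

    Illustrative heuristic only: exact matching, no statistical model.
    """
    spans = []
    i = 0
    while i < len(tokens):
        best = None
        # Try longer units first; keep the candidate covering the most tokens.
        for k in range(max_len, 0, -1):
            unit = tokens[i:i + k]
            if len(unit) < k:
                continue  # not enough tokens left for a unit of this size
            j = i + k
            while tokens[j:j + k] == unit:
                j += k  # extend the run by one more copy of the unit
            if j > i + k and (best is None or j - i > best[1] - best[0]):
                best = (i, j, k)
        if best:
            spans.append(best)
            i = best[1]  # continue after the whole run
        else:
            i += 1
    return spans


if __name__ == "__main__":
    words = "i i i think that the the court".split()
    for start, end, unit_len in find_repetition_runs(words):
        print(words[start:end], "unit length:", unit_len)
```

In this sketch the final copy of the repeated unit is included in the span; a transcript cleaner would keep that last copy and drop the earlier ones.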


doi: 10.21437/Interspeech.2013-604

Cite as: Ostendorf, M., Hahn, S. (2013) A sequential repetition model for improved disfluency detection. Proc. Interspeech 2013, 2624-2628, doi: 10.21437/Interspeech.2013-604

@inproceedings{ostendorf13_interspeech,
  author={Mari Ostendorf and Sangyun Hahn},
  title={{A sequential repetition model for improved disfluency detection}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2624--2628},
  doi={10.21437/Interspeech.2013-604}
}