ISCA Archive Interspeech 2008

Domain-specific classification methods for disfluency detection

Sebastian Germesin, Tilman Becker, Peter Poller

Speech disfluencies are very common in everyday speech and considerably affect NLP systems, which makes systems that can detect or even repair them highly desirable. Previous research achieved good results in disfluency detection, but only for subsets of the disfluency types. The aim of this study was to develop a technology that can cope with a broad range of disfluency types. A thorough investigation of our corpus led us to a detection design in which basic rule-matching techniques are complemented with machine learning and N-gram-based approaches. In this paper, we describe the different detection techniques, each specialized for its own disfluency domain, and the results we obtained.

doi: 10.21437/Interspeech.2008-624

Cite as: Germesin, S., Becker, T., Poller, P. (2008) Domain-specific classification methods for disfluency detection. Proc. Interspeech 2008, 2518-2521, doi: 10.21437/Interspeech.2008-624

  author={Sebastian Germesin and Tilman Becker and Peter Poller},
  title={{Domain-specific classification methods for disfluency detection}},
  booktitle={Proc. Interspeech 2008},