9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Domain-Specific Classification Methods for Disfluency Detection

Sebastian Germesin, Tilman Becker, Peter Poller

DFKI GmbH, Germany

Speech disfluencies are very common in everyday life and considerably affect NLP systems, which makes systems that can detect or even repair them highly desirable. Previous research has achieved good results in disfluency detection, but only for subsets of the disfluency types. The aim of this study was to develop a technology that can cope with a broad range of disfluency types. A thorough investigation of our corpus led us to a detection design in which basic rule-matching techniques are complemented with machine learning and N-gram-based approaches. In this paper, we describe the different detection techniques, each specialized for its own disfluency domain, and the results we obtained.


Bibliographic reference. Germesin, Sebastian / Becker, Tilman / Poller, Peter (2008): "Domain-specific classification methods for disfluency detection", in INTERSPEECH-2008, 2518-2521.