5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Automatic Detection of Sentence Boundaries and Disfluencies Based on Recognized Words

Andreas Stolcke (1), Elizabeth Shriberg (1), Rebecca Bates (2), Mari Ostendorf (2), Dilek Hakkani (1), Madelaine Plauche (1), Gokhan Tur (1), Yu Lu (1)

(1) SRI International, USA
(2) Boston University, USA

We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such events is crucial for speech understanding and other natural language processing tasks. Our approach combines prosodic cues, modeled by decision trees, with word-based event N-gram language models. We investigate several model combination approaches and evaluate the techniques on conversational speech from the Switchboard corpus. Model combination is shown to give a significant improvement over the individual knowledge sources.
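As a rough illustration of what combining two knowledge sources at each interword boundary can look like, the sketch below linearly interpolates two posterior distributions over boundary events. This is an assumption made for illustration only: the abstract does not say which combination methods were investigated, and the event labels, function names, and interpolation weight here are hypothetical.

```python
from typing import Dict, List

# Hypothetical event inventory; the abstract names only sentence
# boundaries and disfluency locations as examples.
EVENTS = ["none", "sentence_boundary", "disfluency"]


def interpolate_posteriors(
    prosody: Dict[str, float],
    lm: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Linearly interpolate two posterior distributions over EVENTS.

    `prosody` stands in for a decision-tree posterior and `lm` for an
    event N-gram language model posterior at the same boundary.
    `weight` is the share given to the prosodic model (an assumed
    tuning parameter, not a value from the paper).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    combined = {
        e: weight * prosody[e] + (1.0 - weight) * lm[e] for e in EVENTS
    }
    # Renormalize in case the inputs do not sum exactly to one.
    total = sum(combined.values())
    return {e: p / total for e, p in combined.items()}


def label_boundaries(
    prosody_posts: List[Dict[str, float]],
    lm_posts: List[Dict[str, float]],
    weight: float = 0.5,
) -> List[str]:
    """Pick the most probable event at each boundary after interpolation."""
    labels = []
    for p, l in zip(prosody_posts, lm_posts):
        combined = interpolate_posteriors(p, l, weight)
        labels.append(max(combined, key=combined.get))
    return labels
```

In this sketch, a weight of 1.0 reduces to the prosodic model alone and a weight of 0.0 to the language model alone, so the single-source cases fall out as special cases of the combination.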


Bibliographic reference. Stolcke, Andreas / Shriberg, Elizabeth / Bates, Rebecca / Ostendorf, Mari / Hakkani, Dilek / Plauche, Madelaine / Tur, Gokhan / Lu, Yu (1998): "Automatic detection of sentence boundaries and disfluencies based on recognized words", in ICSLP-1998, paper 0059.