ISCA Archive Prosody 2001

Disfluency detection: classifying repeated words as planned or unplanned

Cynthia V. Girand, Alan Bell

Improving the accuracy of speech recognition depends partly on the ability to correctly recognize disfluencies in conversational speech. Conversational data contains instances of both unplanned repetitions (e.g. "and it's filling in with with grass...") and planned repetitions (e.g. "We have had very very poor luck with uh all of the core crops..."). If a word repetition is improperly identified as a speech disfluency, important information contained in the speech signal could be lost. The results of this paper show that while planned and unplanned repetitions share some prosodic characteristics of duration and silent pause structure, their duration structures differ significantly in certain respects. The most significant finding is that the types of lexical items that occur in planned and unplanned repetitions are quite different. Fairly accurate (97%) classification of repetitions as either planned or unplanned is possible by considering the word category that the item belongs to.


Cite as: Girand, C.V., Bell, A. (2001) Disfluency detection: classifying repeated words as planned or unplanned. Proc. ITRW on Prosody in Speech Recognition and Understanding, paper 8

@inproceedings{girand01_prosody,
  author={Cynthia V. Girand and Alan Bell},
  title={{Disfluency detection: classifying repeated words as planned or unplanned}},
  year=2001,
  booktitle={Proc. ITRW on Prosody in Speech Recognition and Understanding},
  pages={paper 8}
}