Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Detection of Filled Pauses in Spontaneous Conversational Speech

Marcel Gabrea, Douglas O’Shaughnessy

INRS-Télécommunications, Québec, Canada

Most automatic speech recognition work has concentrated on read speech, whose acoustic properties differ significantly from those of speech found in actual dialogues. A primary difference between read speech and spontaneous speech is the high rate of disfluencies in the latter (e.g., filled pauses, repetitions, repairs, false starts). Filled pauses (e.g., "uh," "um"), unlike silences, resemble phones that occur as parts of words in continuous speech. This paper considers the problem of detecting filled pauses in spontaneous speech and how such detection can be useful in automatic speech recognition. The acoustic aspects of filled pauses in the widely used SWITCHBOARD database [1] are examined, with a view to identifying them acoustically using a combination of duration, fundamental frequency, and spectra.
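
As a rough illustration of how duration, fundamental frequency, and spectra might be combined, the following minimal sketch flags voiced stretches that are long, have a nearly flat F0, and show little frame-to-frame spectral change. It is not the authors' method: the function names, the 10 ms frame step, and all thresholds are hypothetical, and the inputs (per-frame F0 in Hz with 0 for unvoiced frames, plus one spectral vector per frame) are assumed to come from some external front end.

```python
from statistics import mean, pstdev


def voiced_runs(f0):
    """Yield (start, end) frame indices of maximal runs where f0 > 0."""
    start = None
    for i, value in enumerate(f0):
        if value > 0 and start is None:
            start = i
        elif value <= 0 and start is not None:
            yield start, i
            start = None
    if start is not None:
        yield start, len(f0)


def spectral_change(spectra, start, end):
    """Mean Euclidean distance between consecutive spectral frames."""
    dists = [
        sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        for a, b in zip(spectra[start:end - 1], spectra[start + 1:end])
    ]
    return mean(dists) if dists else 0.0


def candidate_filled_pauses(f0, spectra, frame_s=0.01, min_dur_s=0.2,
                            max_f0_cv=0.05, max_change=1.0):
    """Return (start, end) frame ranges that pass all three cues.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    hits = []
    for start, end in voiced_runs(f0):
        # Duration cue: the voiced stretch must be long enough.
        if (end - start) * frame_s < min_dur_s:
            continue
        # F0 cue: the pitch contour must be nearly flat
        # (small coefficient of variation).
        segment = f0[start:end]
        if pstdev(segment) / mean(segment) > max_f0_cv:
            continue
        # Spectral cue: the spectrum must stay nearly stationary.
        if spectral_change(spectra, start, end) > max_change:
            continue
        hits.append((start, end))
    return hits
```

With a 10 ms frame step, a returned pair such as (5, 35) would correspond to a candidate lasting roughly 0.3 s. In practice the thresholds would need to be tuned on labeled data, and a candidate list like this would only be a first pass before any recognition step.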


