Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes an acoustic analysis and perceptual evaluation of the prosodic structure of a spontaneously produced monologue. It was found that a speaker can demarcate larger-scale topical units in spoken discourse by means of intonation (use of melodic boundary markers, scaling of maxima in pitch movements, general decline in average F0) and temporal structure, i.e. by the use of pauses of variable duration. A perception test examined to what extent this prosodic structure is important to listeners. Subjects were presented with three unintelligible (band-pass filtered) versions of a fragment of the elicited monologue: (1) with the original prosody unchanged; (2) with constant pause duration and the original speech melody; (3) with monotonous pitch and the original pause structure. They were instructed to indicate the boundaries of the larger-scale topical units in the three versions. Subjects correctly detected the major discourse boundaries in all three filtered versions in a significant number of cases. They performed best with version 1. Version 2, in turn, yielded better performance than version 3, which suggests that intonation is perceptually more important than pause structure for clarifying the thematic make-up of a text, though the latter dimension is certainly not negligible.
Bibliographic reference. Swerts, Marc / Geluykens, Ronald / Terken, Jacques (1992): "Prosodic correlates of discourse units in spontaneous speech", In ICSLP-1992, 421-424.