ISCA Archive Prosody 2001

Can prosody aid the automatic processing of multi-party meetings? Evidence from predicting punctuation, disfluencies, and overlapping speech

Elizabeth Shriberg, Andreas Stolcke, Don Baron

We investigate whether probabilistic modeling of prosody can aid various automatic labeling tasks essential for processing of multi-party meetings. Task 1, automatic punctuation, seeks to classify sentence boundaries and disfluencies. Task 2, jump-in points, predicts locations within foreground speech at which background speakers start talking; Task 3, jump-in words, examines characteristics of the speech they use to do so. Data are from the ICSI Meeting Recorder corpus. To infer inherent cues, analyses are based on close-talking microphone signals and recognizer forced alignments. As a generous baseline for word-level cues, we compare prosodic models to those of a language model given the true words. Results for Task 1 show prosody reduces classification error by 10% relative over the cheating language model; furthermore, when this task is run in "online" mode, the prosodic model degrades less than does the language model. For Task 2, the language model provides no information, while the prosodic model reduces entropy by 13% over chance. For Task 3, a prosodic model reduces entropy by 25% over chance. Analyses also show interesting prosodic patterns, which differ over tasks. Task 1 uses cues similar to those for Switchboard (but not Broadcast News) data. Task 2 predicts jump-in points that look prosodically like sentence boundaries but that are not actually such boundaries. And Task 3 shows that speakers "raise" their voice when starting during another's talk, compared to starting during silence. These results provide evidence that prosodic modeling can be of use for the automatic processing of meetings. Further results and implications for future automatic meeting processing systems are discussed.
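The two kinds of figures quoted above (a relative reduction in classification error, and an entropy reduction over chance) can be illustrated with a minimal sketch. The abstract does not specify how these were computed, so the definitions below are one common reading, not the authors' method: "chance" is taken as the entropy of the class prior, and the model's entropy as the average negative log-probability it assigns to the true labels. All function names and numbers are hypothetical and are not data from the paper.

```python
import math

def relative_reduction(baseline, model):
    """Fractional reduction of `model` relative to `baseline`."""
    return (baseline - model) / baseline

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution (the 'chance' level)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def cross_entropy_bits(posteriors, labels):
    """Average negative log2 probability the model assigns to the true labels."""
    total = -sum(math.log2(post[y]) for post, y in zip(posteriors, labels))
    return total / len(labels)

# Hypothetical toy data: two equally likely classes, four test events.
prior = [0.5, 0.5]
posteriors = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]]
labels = [0, 1, 0, 1]

chance = entropy_bits(prior)
model = cross_entropy_bits(posteriors, labels)
print(f"entropy reduction over chance: {relative_reduction(chance, model):.1%}")

# Hypothetical error rates: baseline 20%, with the added model 18%.
print(f"relative error reduction: {relative_reduction(0.20, 0.18):.1%}")
```

Under this reading, a model that merely reproduces the class prior yields a reduction of zero, which matches the abstract's description of a language model that "provides no information".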


Cite as: Shriberg, E., Stolcke, A., Baron, D. (2001) Can prosody aid the automatic processing of multi-party meetings? Evidence from predicting punctuation, disfluencies, and overlapping speech. Proc. ITRW on Prosody in Speech Recognition and Understanding, paper 26.

@inproceedings{shriberg01b_prosody,
  author={Elizabeth Shriberg and Andreas Stolcke and Don Baron},
  title={{Can prosody aid the automatic processing of multi-party meetings? Evidence from predicting punctuation, disfluencies, and overlapping speech}},
  year=2001,
  booktitle={Proc. ITRW on Prosody in Speech Recognition and Understanding},
  pages={paper 26}
}