EUROSPEECH 2003 - INTERSPEECH 2003
Recent interest in the automatic processing of meetings is motivated by a desire to summarize, browse, and retrieve important information from lengthy archives of spoken data. One of the most useful capabilities such a technology could provide is a way for users to locate "hot spots", that is, regions in which participants are highly involved in the discussion (e.g., heated arguments or points of excitement). We ask two questions about hot spots in meetings from the ICSI Meeting Recorder corpus. First, we ask whether involvement can be judged reliably by human listeners. Results show that, despite the subjective nature of the task, raters show significant agreement in distinguishing involved from non-involved utterances. Second, we ask whether there is a relationship between human judgments of involvement and automatically extracted prosodic features of the associated regions. Results show that there are significant differences in both F0 and energy between involved and non-involved utterances. These findings suggest that humans do agree to some extent on the judgment of hot spots, and that acoustic-only cues could be used for automatic detection of hot spots in natural meetings.
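As an illustration of the first question, the sketch below shows one common way to quantify agreement between two raters who label the same utterances as involved (1) or non-involved (0). The abstract does not name the agreement statistic used, so the choice of Cohen's kappa here is an assumption, and the function name `cohens_kappa` and the label lists are hypothetical and not taken from the ICSI Meeting Recorder corpus.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items.

    Assumption: Cohen's kappa is used purely for illustration; the
    abstract does not state which agreement measure was applied.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for eight utterances: 1 = involved, 0 = non-involved.
rater_a = [1, 1, 0, 0, 1, 0, 0, 0]
rater_b = [1, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f}")
```

A kappa of 0 corresponds to chance-level agreement and 1 to perfect agreement, which is why a chance-corrected measure suits a subjective judgment task in which one label may be much more frequent than the other.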
Bibliographic reference. Wrede, Britta / Shriberg, Elizabeth (2003): "Spotting 'hot spots' in meetings: human judgments and prosodic cues", in EUROSPEECH-2003, 2805-2808.