7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Automatic Punctuation and Disfluency Detection in Multi-Party Meetings Using Prosodic and Lexical Cues

Don Baron, Elizabeth Shriberg, Andreas Stolcke

International Computer Science Institute, USA

We investigate automatic approaches to finding "hidden" spontaneous speech events, such as sentence boundaries and disfluencies, in multi-party meetings. Hidden events are characterized prosodically by a large array of automatically extracted energy, duration, and pitch features, and are modeled by decision tree classifiers; lexical cues are modeled by N-gram language models. Both sources of information are combined in a hidden Markov model framework. Results show that the combined classifiers achieve higher accuracy than either knowledge source alone. We also study classifiers that use only the preceding context to predict events, simulating online processing, and find that prosodic features are more robust to this constraint than language model features. Finally, we examine the effect of automatic word recognition errors, in both training and testing, on classification accuracy. Lexical models degrade much more severely than prosodic models in this case, again showing the relative robustness of prosodic information for hidden-event detection in natural conversation.
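The abstract does not spell out how the two knowledge sources are combined, so the following is only a minimal sketch of one way such a combination could look. It assumes that each inter-word boundary carries one hidden event, that a prosodic classifier supplies a posterior per boundary, and that this posterior is divided by the event prior to act as a scaled likelihood inside a Viterbi search over a lexical event model. The function name `viterbi_events`, the callable `lm_logp`, the two-event inventory, and the `weight` parameter are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical event inventory: one hidden event per inter-word boundary.
EVENTS = ("none", "boundary")


def viterbi_events(lm_logp, prosody_post, prior, weight=1.0):
    """Return the best-scoring event sequence under a toy HMM combination.

    lm_logp(i, prev, e): log probability from a lexical model of event e at
        boundary i given the previous event prev ("start" when i == 0).
    prosody_post[i][e]: prosodic classifier posterior for event e at boundary i.
    prior[e]: event prior, used to turn posteriors into scaled likelihoods.
    weight: assumed exponent on the prosodic score (0 ignores prosody).
    """
    n = len(prosody_post)
    if n == 0:
        return []
    best = {}   # best log score of a path ending in each event
    back = []   # back[i][e] = best previous event for e at boundary i
    for i in range(n):
        new_best, ptr = {}, {}
        for e in EVENTS:
            # Scaled likelihood: log P(e | prosody) - log P(e).
            obs = weight * (math.log(prosody_post[i][e]) - math.log(prior[e]))
            if i == 0:
                new_best[e] = lm_logp(0, "start", e) + obs
                ptr[e] = None
            else:
                prev = max(EVENTS, key=lambda p: best[p] + lm_logp(i, p, e))
                new_best[e] = best[prev] + lm_logp(i, prev, e) + obs
                ptr[e] = prev
        best = new_best
        back.append(ptr)
    # Trace the best path backwards from the final boundary.
    last = max(EVENTS, key=lambda e: best[e])
    path = [last]
    for i in range(n - 1, 0, -1):
        last = back[i][last]
        path.append(last)
    path.reverse()
    return path


# Toy lexical model (made-up numbers): boundaries are rare, and two in a
# row are rarer still. A real model would also condition on the words.
_TOY = {
    "start": {"none": 0.8, "boundary": 0.2},
    "none": {"none": 0.8, "boundary": 0.2},
    "boundary": {"none": 0.95, "boundary": 0.05},
}


def toy_lm(i, prev, e):
    return math.log(_TOY[prev][e])


toy_prior = {"none": 0.8, "boundary": 0.2}
toy_prosody = [
    {"none": 0.9, "boundary": 0.1},
    {"none": 0.3, "boundary": 0.7},    # e.g. long pause and pitch reset
    {"none": 0.85, "boundary": 0.15},
]

print(viterbi_events(toy_lm, toy_prosody, toy_prior))
```

With these made-up numbers, the lexical model alone (`weight=0.0`) prefers no boundary anywhere, while the combined search places a boundary at the second position, which illustrates how prosodic evidence can override a weak lexical preference.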


Bibliographic reference. Baron, Don / Shriberg, Elizabeth / Stolcke, Andreas (2002): "Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues", in ICSLP-2002, 949-952.