7th International Conference on Spoken Language Processing
September 16-20, 2002
In this paper we develop a maximum-entropy-based method for annotating spontaneous conversational speech with punctuation. The goal of this task is to make automatic transcriptions more readable by humans, and to render them into a form that is useful for subsequent natural language processing and discourse analysis. Our basic approach is to view the insertion of punctuation as a form of tagging, in which each word is tagged with the punctuation that should follow it, and to apply a maximum entropy tagger that uses both lexical and prosodic features. We present experimental results on Switchboard data with both reference transcriptions and transcriptions produced by a speech recognition system.
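The tagging view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set, the feature templates, the 0.3-second pause threshold, the toy data, and all function names are assumptions made for illustration, and the model is fit by plain stochastic gradient ascent on a regularized log-likelihood rather than by whatever training procedure the paper used. Pause duration stands in for the prosodic features, which the abstract does not enumerate.

```python
import math
from collections import defaultdict

# Hypothetical tag set: the punctuation mark (if any) that follows a word.
TAGS = ["NONE", "COMMA", "PERIOD", "QUESTION"]

# Toy data invented for illustration:
# (words, pause after each word in seconds, punctuation tag per word).
TOY = [
    (["yeah", "i", "think", "so"], [0.4, 0.0, 0.0, 0.6],
     ["COMMA", "NONE", "NONE", "PERIOD"]),
    (["do", "you", "like", "it"], [0.0, 0.0, 0.0, 0.5],
     ["NONE", "NONE", "NONE", "QUESTION"]),
]

def features(words, pauses, i):
    """Indicator features for position i: lexical context plus a pause cue."""
    prev_word = words[i - 1] if i > 0 else "<s>"
    next_word = words[i + 1] if i + 1 < len(words) else "</s>"
    return [
        "bias",
        "w=" + words[i],
        "prev=" + prev_word,
        "next=" + next_word,
        # Assumed prosodic feature: is the following pause at least 0.3 s?
        "long_pause" if pauses[i] >= 0.3 else "short_pause",
    ]

def build_examples(corpus):
    """Flatten utterances into (feature list, gold tag) pairs."""
    return [(features(w, p, i), tags[i])
            for w, p, tags in corpus for i in range(len(w))]

def probs(weights, feats):
    """Maximum entropy (softmax) distribution over TAGS for one word."""
    scores = [sum(weights.get((f, t), 0.0) for f in feats) for t in TAGS]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(examples, epochs=200, lr=0.1, l2=0.01):
    """Stochastic gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            p = probs(weights, feats)
            for k, tag in enumerate(TAGS):
                # Observed minus expected count for each active feature.
                grad = (1.0 if tag == gold else 0.0) - p[k]
                for f in feats:
                    weights[(f, tag)] += lr * (grad - l2 * weights[(f, tag)])
    return weights

def punctuate(weights, words, pauses):
    """Assign the most probable punctuation tag to every word."""
    tags = []
    for i in range(len(words)):
        p = probs(weights, features(words, pauses, i))
        tags.append(TAGS[p.index(max(p))])
    return tags

model = train(build_examples(TOY))
print(punctuate(model, ["yeah", "i", "think", "so"], [0.4, 0.0, 0.0, 0.6]))
```

Each word is classified independently here; a fuller tagger would typically also condition on previously assigned tags and search over tag sequences, which this sketch omits for brevity.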
Bibliographic reference. Huang, Jing / Zweig, Geoffrey (2002): "Maximum entropy model for punctuation annotation from speech", In ICSLP-2002, 917-920.