EUROSPEECH 2003 - INTERSPEECH 2003
This paper describes a system that extracts the principal content words from speech-recognized transcripts of voicemail messages and classifies them as proper names, telephone numbers, dates/times, or 'other'. The resulting short text summaries are suitable for mobile messaging applications. The system uses a set of classifiers to identify the summary words, with each word represented by a vector of lexical and prosodic features. The features are selected using Parcel, an ROC-based algorithm. We visually compare the role of a large number of individual features and discuss effective ways to combine them. Finally, we evaluate performance on manual transcriptions and on automatic transcriptions derived from two different speech recognition systems.
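To make the idea of ROC-based feature comparison concrete, the following minimal sketch ranks individual word-level features by the area under the ROC curve (AUC) for one binary target, such as "is this word a proper name?". It is not the Parcel algorithm, whose details are not given in this abstract; it only illustrates how single features can be compared on an ROC basis. The feature names, the toy values, and the helper functions `roc_auc` and `rank_features` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rank_features(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    names: Sequence[str],
) -> List[Tuple[str, float]]:
    """Rank each feature column by how well it alone separates the classes.

    A feature that separates the classes in the inverse direction is just
    as informative, so the score is max(AUC, 1 - AUC).
    """
    ranked = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        auc = roc_auc(column, labels)
        ranked.append((name, max(auc, 1.0 - auc)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data only: one row per word, columns are hypothetical features.
    names = ["word_duration", "f0_range", "word_length"]
    rows = [
        [0.42, 30.0, 7.0],
        [0.38, 12.0, 6.0],
        [0.15, 25.0, 3.0],
        [0.10, 14.0, 2.0],
    ]
    labels = [1, 1, 0, 0]  # 1 = target class (e.g. proper name), 0 = other
    for name, score in rank_features(rows, labels, names):
        print(f"{name}: {score:.2f}")
```

Ranking features one at a time ignores interactions between them, so a sketch like this is at most a first screening step before combinations of features are considered.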
Bibliographic reference. Koumpis, Konstantinos / Renals, Steve (2003): "Multi-class extractive voicemail summarization", In EUROSPEECH-2003, 2785-2788.