INTERSPEECH 2004 - ICSLP
This paper addresses automatic extraction of key sentences from academic presentation speeches. The method exploits characteristic expressions that appear in the initial utterances of sections; these expressions are defined as discourse markers and are derived in a totally unsupervised manner from word statistics. The statistics of the discourse markers are then used to define the importance of sentences, and this measure is combined with the conventional tf-idf measure of content words. We present a comprehensive evaluation using the Corpus of Spontaneous Japanese under a variety of experimental setups, with an evaluation scheme carefully designed to allow comparison with human performance. The proposed method using discourse markers is consistently effective for key sentence extraction. Based on the resulting indexing, we realize efficient browsing of lecture audio archives.
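As a rough illustration of the idea summarized above, the following Python sketch scores sentences by combining a discourse-marker measure with a tf-idf measure. It is not the formulation used in the paper: the log-ratio marker weight, the max-normalization, the linear interpolation weight `lam`, and all function names are assumptions introduced only for this example. Sentences are assumed to be pre-tokenized lists of words, and the section-initial sentences are assumed to be drawn from the same collection as `all_sentences`.

```python
import math
from collections import Counter


def marker_weights(initial_sentences, all_sentences):
    """Hypothetical marker weight: log ratio of a word's relative frequency
    in section-initial sentences to its relative frequency overall.
    Only words over-represented in initial sentences are kept."""
    init = Counter(w for s in initial_sentences for w in s)
    total = Counter(w for s in all_sentences for w in s)
    n_init, n_total = sum(init.values()), sum(total.values())
    weights = {}
    for w, c in init.items():
        if total[w] == 0:
            continue  # word unseen in the overall collection; skip it
        ratio = (c / n_init) / (total[w] / n_total)
        if ratio > 1.0:
            weights[w] = math.log(ratio)
    return weights


def tfidf_scores(sentences, documents):
    """Sum of tf-idf over the words of each sentence. tf is counted over
    the whole talk (all given sentences); idf over the given documents."""
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))
    tf = Counter(w for s in sentences for w in s)
    return [
        sum(tf[w] * math.log(n_docs / df[w]) for w in s if df[w] > 0)
        for s in sentences
    ]


def normalize(scores):
    """Scale scores to [0, 1] by the maximum (assumed normalization)."""
    top = max(scores) if scores else 0.0
    return [x / top if top > 0 else 0.0 for x in scores]


def rank_sentences(sentences, documents, weights, lam=0.5):
    """Return sentence indices sorted by the combined score, best first.
    lam is an assumed interpolation weight between the two measures."""
    marker = normalize([sum(weights.get(w, 0.0) for w in s) for s in sentences])
    tfidf = normalize(tfidf_scores(sentences, documents))
    combined = [lam * m + (1.0 - lam) * t for m, t in zip(marker, tfidf)]
    return sorted(range(len(sentences)), key=lambda i: combined[i], reverse=True)
```

In this sketch, setting `lam=1.0` ranks by the marker measure alone and `lam=0.0` by tf-idf alone, which makes it easy to compare the contribution of each measure on a small sample.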
Bibliographic reference. Kitade, Tasuku / Kawahara, Tatsuya / Nanjo, Hiroaki (2004): "Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers", In INTERSPEECH-2004, 2169-2172.