Extractive summarization, which aims to automatically select a set of representative sentences from a text (or spoken) document so as to concisely express the most important theme of the document, has been an active area of experimentation and development. A recent research trend is to employ the language modeling (LM) approach for important sentence selection, which has proven effective for performing extractive summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and estimate their parameters more accurately for each text (or spoken) document to be summarized. This paper extends this line of research, and its contributions are threefold. First, we propose a positional language modeling framework that uses different granularities of position-specific information to better estimate the sentence models involved in summarization. Second, we explore integrating the positional cues into relevance modeling through a pseudo-relevance feedback procedure. Third, the utility of the various methods derived from our proposed framework and of several well-established unsupervised methods is analyzed and compared extensively. Empirical evaluations conducted on a broadcast news summarization task seem to demonstrate the performance merits of our summarization methods.
Bibliographic reference. Liu, Shih-Hung / Chen, Kuan-Yu / Chen, Berlin / Wang, Hsin-Min / Yen, Hsu-Chun / Hsu, Wen-Lian (2015): "Positional language modeling for extractive broadcast news speech summarization", In INTERSPEECH-2015, 2729-2733.