Extractive multi-document summarization is the task of selecting sentences from a set of documents to compose a summary in response to a user query. We propose a generative approach that explicitly identifies summary and non-summary topic distributions over the sentences of a document cluster. Using the resulting approximate summary topic probabilities as latent output variables, we train a discriminative classifier model. The trained model is then applied to the sentences of new document clusters to infer which of them belong in the summary. In our experiments, the proposed summarization approach proves effective in comparison to state-of-the-art methods.
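The two-stage idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the generative stage has already produced an approximate summary topic probability for each training sentence (the numbers below are invented for illustration), and it uses a plain least-squares linear regressor as a stand-in for the discriminative model. The feature definitions and all function names are hypothetical.

```python
# Sketch only. The targets stand in for the approximate summary topic
# probabilities that a generative topic model would supply; the values,
# features, and function names here are hypothetical.

def fit_linear(features, targets, lr=0.1, epochs=2000):
    """Fit a least-squares linear regressor by batch gradient descent."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(d):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def score(model, x):
    """Predicted summary score for one sentence feature vector."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def select_summary(model, sentences, features, k):
    """Return the k highest-scoring sentences, kept in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(model, features[i]),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]

# Training cluster: two toy features per sentence (assumed to be
# query-term overlap and a position score), with invented targets.
train_features = [[0.9, 0.8], [0.1, 0.2], [0.7, 0.5], [0.2, 0.9], [0.0, 0.1]]
train_targets = [0.85, 0.10, 0.65, 0.30, 0.05]
model = fit_linear(train_features, train_targets)

# New cluster: score unseen sentences and extract the top two.
new_sentences = ["s1", "s2", "s3", "s4"]
new_features = [[0.8, 0.6], [0.1, 0.1], [0.3, 0.9], [0.6, 0.7]]
summary = select_summary(model, new_sentences, new_features, k=2)
print(summary)
```

The point of the split is that the expensive generative inference is needed only on the training clusters; sentences in a new cluster are scored directly from their features by the trained model.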
Bibliographic reference. Celikyilmaz, Asli / Hakkani-Tür, Dilek (2010): "Extractive summarization using a latent variable model", In INTERSPEECH-2010, 2526-2529.