Exploring Word Mover’s Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization

Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu


Extractive summarization selects the most salient sentences from a document (or a set of documents) and assembles them into an informative summary, so that users can browse and assimilate the main theme of the document efficiently. Our work in this paper continues this general line of research, and its main contributions are two-fold. First, we explore the use of the recently proposed word mover’s distance (WMD) metric, in conjunction with semantic-aware continuous space representations of words, to capture finer-grained sentence-to-document and/or sentence-to-sentence semantic relatedness for effective use in the summarization process. Second, we investigate combining our proposed approach with several state-of-the-art summarization methods, which originally adopted conventional term-overlap or bag-of-words (BOW) approaches for similarity calculation. A series of experiments conducted on a typical broadcast news summarization task suggests the performance merits of our proposed approach in comparison with the mainstream methods.
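The abstract does not spell out how WMD-based sentence-to-document relatedness is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical table of 2-D word vectors (`TOY_VECTORS`, invented for illustration) and swaps in the relaxed lower bound on WMD, where each word moves all of its mass to its nearest counterpart, instead of solving the full optimal-transport problem. The function names `nbow`, `one_sided_cost`, `relaxed_wmd`, and `rank_sentences` are likewise illustrative.

```python
import math
from collections import Counter

# Hypothetical 2-D word vectors, invented for illustration only.
# A real system would load trained embeddings instead.
TOY_VECTORS = {
    "president": (1.0, 0.0),
    "leader": (0.9, 0.1),
    "speech": (0.0, 1.0),
    "address": (0.1, 0.9),
    "weather": (-1.0, -1.0),
    "rain": (-0.9, -1.1),
}


def nbow(tokens):
    """Normalized bag-of-words weights that sum to 1."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst, vectors):
    """Cost of moving each source word's mass to its nearest target word."""
    return sum(
        weight * min(math.dist(vectors[w], vectors[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b, vectors=TOY_VECTORS):
    """Relaxed WMD: the larger of the two one-sided costs.

    This is a lower bound on the exact WMD, not the exact distance.
    """
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided_cost(a, b, vectors), one_sided_cost(b, a, vectors))


def rank_sentences(sentences, document_tokens, vectors=TOY_VECTORS):
    """Sentence indices ordered from closest to farthest from the document."""
    scores = [relaxed_wmd(s, document_tokens, vectors) for s in sentences]
    return sorted(range(len(sentences)), key=lambda i: scores[i])


if __name__ == "__main__":
    document = ["president", "speech", "leader"]
    sentences = [["weather", "rain"], ["leader", "address"]]
    print(rank_sentences(sentences, document))
```

Under these toy vectors, the sentence "leader address" ranks ahead of "weather rain" even though it shares only one word with the document, which is the kind of relatedness a pure term-overlap measure would miss. An extractive summarizer could then select the top-ranked sentences, though how the paper combines such scores with other methods is not described in the abstract.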


DOI: 10.21437/Interspeech.2016-710

Cite as

Liu, S., Chen, K., Hsieh, Y., Chen, B., Wang, H., Yen, H., Hsu, W. (2016) Exploring Word Mover’s Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization. Proc. Interspeech 2016, 670-674.

BibTeX
@inproceedings{Liu+2016,
author={Shih-Hung Liu and Kuan-Yu Chen and Yu-Lun Hsieh and Berlin Chen and Hsin-Min Wang and Hsu-Chun Yen and Wen-Lian Hsu},
title={Exploring Word Mover’s Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-710},
url={http://dx.doi.org/10.21437/Interspeech.2016-710},
pages={670--674}
}