ISCA Archive Interspeech 2008

Cross-lingual sentence extraction for information distillation

Adish Kumar Singla, Dilek Hakkani-Tür

Information distillation aims to analyze and interpret large volumes of speech and text archives in multiple languages and produce structured information of interest to the user. In this work, we investigate cross-lingual information distillation, where non-English (source language) documents are searched in response to user queries that are in English (target language). We propose to perform distillation both on the original source language data and on their English translations produced by machine translation, and to combine the two outputs. We show experimentally that the combination approach results in 8% to 16% absolute (13% to 31% relative) F-measure improvement over previous work.

doi: 10.21437/Interspeech.2008-671

Cite as: Singla, A.K., Hakkani-Tür, D. (2008) Cross-lingual sentence extraction for information distillation. Proc. Interspeech 2008, 2707-2710, doi: 10.21437/Interspeech.2008-671

  author={Adish Kumar Singla and Dilek Hakkani-Tür},
  title={{Cross-lingual sentence extraction for information distillation}},
  booktitle={Proc. Interspeech 2008},