9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Unsupervised Learning of Edit Parameters for Matching Name Variants

Dan Gillick (1), Dilek Hakkani-Tür (2), Michael Levit (2)

(1) University of California at Berkeley, USA; (2) ICSI, USA

Since named entities are often written in different ways, question answering (QA) and other language processing tasks stand to benefit from entity matching. We address the problem of finding equivalent person names in unstructured text. Our approach is a generalization of spelling correction: we compare an input name to candidate matches by applying a set of edits to the input name. We introduce a novel unsupervised method for learning spelling edit probabilities, which improves overall F-measure on our own name-matching task by 12%. We demonstrate the relevance of the method by applying it to the GALE Distillation task.
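To illustrate the general idea of scoring a candidate match with edit probabilities, here is a minimal sketch of a probability-weighted edit distance. It is not the method from the paper: the function name `edit_cost`, the `sub_prob` table, the default probabilities, and the choice to treat matching characters as free are all assumptions made for illustration, and the sketch does not show how the probabilities would be learned.

```python
import math


def edit_cost(source, target, sub_prob, indel_prob=0.01, default_sub_prob=0.01):
    """Cheapest total -log probability of character edits turning source into target.

    sub_prob maps (source_char, target_char) to a hypothetical substitution
    probability; pairs not in the table fall back to default_sub_prob.
    Insertions and deletions share one assumed probability, indel_prob.
    Matching characters cost nothing (a simplifying assumption).
    """
    indel = -math.log(indel_prob)
    n, m = len(source), len(target)

    # dp[i][j] = cheapest cost of transforming source[:i] into target[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * indel  # delete every character of source[:i]
    for j in range(1, m + 1):
        dp[0][j] = j * indel  # insert every character of target[:j]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = source[i - 1], target[j - 1]
            if a == b:
                sub = 0.0
            else:
                sub = -math.log(sub_prob.get((a, b), default_sub_prob))
            dp[i][j] = min(
                dp[i - 1][j] + indel,    # delete a
                dp[i][j - 1] + indel,    # insert b
                dp[i - 1][j - 1] + sub,  # substitute a -> b, or match
            )
    return dp[n][m]


# Hypothetical table: substituting "o" with "a" is assumed to be likely.
likely_edits = {("o", "a"): 0.5}
cost_with_table = edit_cost("jon", "jan", likely_edits)
cost_without_table = edit_cost("jon", "jan", {})
```

With a table like `likely_edits`, a variant that differs only by a probable substitution receives a lower cost than it would under uniform edit costs, which is the effect that learned edit probabilities are meant to have on name matching.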


Bibliographic reference. Gillick, Dan / Hakkani-Tür, Dilek / Levit, Michael (2008): "Unsupervised learning of edit parameters for matching name variants", In INTERSPEECH-2008, 467-470.