10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Named Entity Network Based on Wikipedia

Sameer Maskey (1), Wisam Dakka (2)

(1) IBM T.J. Watson Research Center, USA
(2) Google Inc., USA

Named Entities (NEs) play an important role in many natural language and speech processing tasks, so a resource that identifies relations between NEs could be very useful. We present such a resource, automatically generated from Wikipedia: the Named Entity Network (NE-NET), which provides a list of related NEs and the degree of relation for any given NE. Unlike some manually built knowledge resources, NE-NET has wide coverage, consisting of 1.5 million NEs represented as nodes of a graph with 6.5 million arcs relating them. NE-NET also ranks the related NEs using a simple ranking function that we propose. In this paper, we present NE-NET and experiments showing how it can be used to improve the retrieval of spoken (Broadcast News) and text documents.
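
To make the graph structure concrete, here is a minimal sketch of a network in which nodes are NEs and weighted arcs link related NEs. The abstract does not specify the ranking function, so the arc weight below is a plain accumulated count used only as a stand-in. The class name, method names, and sample entities and weights are all hypothetical.

```python
from collections import defaultdict


class NamedEntityNetwork:
    """Toy weighted graph: nodes are named entities, arcs carry a relation weight."""

    def __init__(self):
        # arcs[source][target] -> accumulated weight.
        # Assumption: a raw count stands in for the paper's ranking function,
        # which the abstract does not describe.
        self.arcs = defaultdict(lambda: defaultdict(float))

    def add_relation(self, source, target, weight=1.0):
        """Add a directed arc, or strengthen it if it already exists."""
        self.arcs[source][target] += weight

    def related(self, entity, top_k=None):
        """Return (related_entity, weight) pairs, highest weight first."""
        neighbours = self.arcs.get(entity, {})
        # Sort by descending weight, breaking ties alphabetically.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked if top_k is None else ranked[:top_k]


# Made-up sample data, for illustration only.
net = NamedEntityNetwork()
net.add_relation("Brighton", "United Kingdom", 3)
net.add_relation("Brighton", "Sussex", 5)
net.add_relation("Brighton", "London", 1)
print(net.related("Brighton", top_k=2))
```

In a retrieval setting, a ranked list like this could serve as a source of expansion terms for a query that mentions the NE. That use is one possible reading of the abstract, not a description of the authors' experiments.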

Bibliographic reference. Maskey, Sameer / Dakka, Wisam (2009): "Named entity network based on Wikipedia", in INTERSPEECH-2009, 1515-1518.