A Survey of Person Name Disambiguation on the Web

Bibliographic details
Published in: IEEE Access 2018, Vol. 6, pp. 59496-59514
Main authors: Delgado, Agustin D.; Montalvo, Soto; Martinez Unanue, Raquel; Fresno, Victor
Format: Article
Language: English
Online access: Full text
Description
Summary: Person name disambiguation on the Web (PNDW) consists of grouping the Web pages retrieved by a search engine when a person's name is queried according to the individuals they refer to. This problem is of interest to the research community because Internet users often search for information about people on search engines, and also because people's names are a very ambiguous type of named entity. In addition, the Web domain presents several challenges for natural language processing and information retrieval methods. In this paper, we classify PNDW systems according to their main characteristics: 1) features used to identify different individuals with the same name; 2) mathematical models used to represent the search results; 3) clustering algorithms used to group the Web pages; 4) methods used to address the impact of Web pages from social networking sites; and 5) methods used to deal with the multilingual nature of the Web. Also, we present the data sets most widely used to evaluate PNDW systems. Finally, we analyze the results obtained by the best PNDW systems in the literature.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2018.2874891
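
The summary describes PNDW as a three-part pipeline: extract features from each search result, represent the results in a mathematical model, and cluster the pages by individual. The sketch below illustrates that general shape only; it is not taken from the survey or from any system it reviews. It assumes bag-of-words features, TF-IDF weighting, cosine similarity, and single-link grouping with a fixed threshold. The function names, the sample pages, and the threshold value of 0.1 are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a page and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(pages):
    """Return one sparse TF-IDF vector (dict: term -> weight) per page."""
    token_lists = [tokenize(p) for p in pages]
    n = len(pages)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Terms found in every page (such as the queried name) get weight 0.
        vectors.append({t: c * math.log(n / doc_freq[t]) for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_pages(pages, threshold=0.1):
    """Group page indices: pages linked by similarity > threshold share a cluster."""
    vectors = tfidf_vectors(pages)
    parent = list(range(len(pages)))  # union-find forest over page indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            if cosine(vectors[i], vectors[j]) > threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(pages)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical search results for the query "John Smith".
pages = [
    "John Smith professor of computer science publishes research on machine learning",
    "Professor John Smith teaches machine learning and computer science courses",
    "John Smith guitarist releases new rock album and announces tour",
    "Rock guitarist John Smith tour dates and album reviews",
]
print(cluster_pages(pages))  # [[0, 1], [2, 3]]
```

Because the queried name occurs in every page, its TF-IDF weight is zero and the grouping is driven by the remaining vocabulary. A fixed threshold is a simplification: the number of distinct individuals behind a name is not known in advance, so the stopping criterion is one of the harder design choices in this setting.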