A Survey of Person Name Disambiguation on the Web
Published in: | IEEE Access 2018, Vol. 6, pp. 59496-59514 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Person name disambiguation on the Web (PNDW) consists of grouping the Web pages retrieved by a search engine when a person's name is queried according to the individuals they refer to. This problem is of interest to the research community because Internet users often search for information about people on search engines, and also because people's names are a very ambiguous type of named entity. In addition, the Web domain presents several challenges for natural language processing and information retrieval methods. In this paper, we classify PNDW systems according to their main characteristics: 1) features used to identify different individuals with the same name; 2) mathematical models used to represent the search results; 3) clustering algorithms used to group the Web pages; 4) methods used to address the impact of Web pages from social networking sites; and 5) methods used to deal with the multilingual nature of the Web. Also, we present the data sets most widely used to evaluate PNDW systems. Finally, we analyze the results obtained by the best PNDW systems in the literature. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2018.2874891 |