Disambiguating Authors by Pairwise Classification


Bibliographic details
Published in: Tsinghua Science and Technology, 2010-12, Vol. 15 (6), pp. 668-677
Authors: 林泉 王波 杜圆 王雪至 李玉华 陈松灿
Format: Article
Language: English
Description
Abstract: Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP, ACM, and CiteSeerX. Despite many studies, the problem remains unresolved and is becoming more serious with the increasing popularity of Web 2.0. This paper addresses the problem in the academic researcher social network ArnetMiner using a supervised method that exploits all available side information, including co-authors, organization, paper citations, title similarity, the author's homepage, web constraints, and user feedback. The method automatically determines the number of distinct persons k. Tests on the researcher social network with up to 100 different names show that the method significantly outperforms a baseline that uses an unsupervised attribute-augmented graph clustering algorithm.
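
The abstract does not spell out the classifier or how k is obtained, so the following is only a minimal sketch of the general idea of pairwise disambiguation, not the paper's method. It assumes hand-set feature weights (a supervised method would learn them from labeled pairs), a hypothetical decision threshold, and a grouping step that merges every pair judged "same author" so that the number of resulting groups serves as k. All field names, weights, and sample records are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical sample records for papers that share one ambiguous author name.
PAPERS = [
    {"coauthors": ["A. Smith"], "org": "Tsinghua", "title": "graph clustering for networks"},
    {"coauthors": ["A. Smith", "B. Lee"], "org": "Tsinghua", "title": "graph mining"},
    {"coauthors": ["C. Kim"], "org": "MIT", "title": "protein folding"},
]

# Assumed weights; a supervised method would fit these on labeled pairs.
WEIGHTS = {"coauthor": 0.5, "organization": 0.3, "title": 0.2}


def jaccard(x, y):
    """Word-level Jaccard similarity between two titles."""
    xs, ys = set(x.lower().split()), set(y.lower().split())
    union = xs | ys
    return len(xs & ys) / len(union) if union else 0.0


def same_author_score(a, b, weights):
    """Weighted sum of three simple pairwise features."""
    features = {
        "coauthor": 1.0 if set(a["coauthors"]) & set(b["coauthors"]) else 0.0,
        "organization": 1.0 if a["org"] and a["org"] == b["org"] else 0.0,
        "title": jaccard(a["title"], b["title"]),
    }
    return sum(weights[name] * value for name, value in features.items())


def cluster_papers(papers, weights, threshold=0.5):
    """Merge every pair scored at or above the threshold (union-find).

    Returns one group label per paper; the number of distinct labels
    is the inferred number of persons k.
    """
    parent = list(range(len(papers)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(papers)), 2):
        if same_author_score(papers[i], papers[j], weights) >= threshold:
            parent[find(i)] = find(j)

    return [find(i) for i in range(len(papers))]


labels = cluster_papers(PAPERS, WEIGHTS)
k = len(set(labels))  # inferred number of distinct persons
```

In this sketch k is never supplied in advance: it falls out of how many groups remain after merging, which is one simple way a pairwise approach can avoid fixing the number of persons up front.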
ISSN: 1007-0214, 1878-7606
DOI: 10.1016/S1007-0214(10)70114-0