Consolidating identities in anonymous ego-centred collaboration networks

Bibliographic details
Published in: Journal of complex networks 2021-02, Vol.9 (1)
Main authors: Gomide, Janaina; Kling, Hugo; Figueiredo, Daniel
Format: Article
Language: English
Online access: Full text
Description
Summary: Individuals often appear under multiple names in large datasets collected from different sources, giving rise to name ambiguities. Classical techniques that tackle this problem leverage personal information such as names and institutions. However, as privacy concerns continue to rise, Personally Identifiable Information (PII) may not be available in publicly released data. This work considers the synonym name ambiguity problem in anonymous ego-centred collaboration networks. The ego-centred collaboration network is generated from the individual’s profile and stripped of all PII. Using just the anonymous network, and no other side information, we propose an algorithm based on dominating sets to identify the different nodes that correspond to the profile owner (synonyms). The proposed approach is applied to different datasets originating from profiles in DBLP and Google Scholar, showing relatively high precision (e.g. 75% of profiles were perfectly mapped). This methodology indicates that ambiguous ego-centred networks have enough structural information to correctly identify synonyms of the individual.
ISSN: 2051-1310; 2051-1329
DOI:10.1093/comnet/cnab013
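
Illustrative note: the summary says only that the proposed algorithm is "based on dominating sets" and gives no further detail. The sketch below is therefore not the authors' method. It is a generic greedy dominating-set heuristic on a toy anonymous network, included to show why such a set is a plausible candidate for the profile owner's synonyms. It assumes that every collaborator in the network shares at least one publication with some name variant of the owner, so that the synonym nodes together dominate the graph. The function name `greedy_dominating_set`, the adjacency-dictionary representation, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set

# Anonymous network as an adjacency dictionary: node id -> set of neighbour ids.
Graph = Dict[int, Set[int]]


def greedy_dominating_set(adj: Graph) -> List[int]:
    """Greedy heuristic (an assumption, not the paper's algorithm).

    Repeatedly picks the node whose closed neighbourhood (the node plus its
    neighbours) covers the most still-uncovered nodes, until every node is
    either chosen or adjacent to a chosen node.
    """
    uncovered = set(adj)
    chosen: List[int] = []
    while uncovered:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(adj), key=lambda v: len(({v} | adj[v]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | adj[best]
    return chosen


# Toy data (hypothetical): nodes 0 and 1 are two name variants of the same
# profile owner; nodes 2..7 are collaborators. The two variants never
# co-author with each other, so there is no edge between 0 and 1.
toy_network: Graph = {
    0: {2, 3, 4},
    1: {5, 6, 7},
    2: {0, 3},
    3: {0, 2},
    4: {0},
    5: {1, 6},
    6: {1, 5},
    7: {1},
}

candidates = greedy_dominating_set(toy_network)
print("Candidate synonym nodes:", candidates)
```

On real profiles a minimum dominating set need not be unique, so a heuristic like this one can only propose candidates; how the published algorithm selects among them is not stated in this record.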