Consolidating identities in anonymous ego-centred collaboration networks
Published in: | Journal of Complex Networks 2021-02, Vol.9 (1) |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | Abstract
Individuals often appear with multiple names when considering large datasets collected from different sources, giving rise to name ambiguities. Classical techniques that tackle this problem leverage personal information such as names and institutions. However, as privacy concerns continue to rise, Personally Identifiable Information (PII) may not be available in publicly released data. This work considers the synonym name ambiguity problem in anonymous ego-centred collaboration networks. The ego-centred collaboration network is generated from the individual’s profile and stripped of all PII. Using just the anonymous network, and no other side information, we propose an algorithm based on dominating sets to identify the different nodes that correspond to the profile owner (synonyms). The proposed approach is applied to different datasets originating from profiles in DBLP and Google Scholar, showing a relatively high precision (e.g. 75% of profiles were perfectly mapped). This methodology indicates that ambiguous ego-centred networks have enough structural information to correctly identify synonyms of the individual. |
---|---|
ISSN: | 2051-1310 2051-1329 |
DOI: | 10.1093/comnet/cnab013 |