Linking temporal records

Bibliographic details
Published in: Frontiers of Computer Science 2012-06, Vol.6 (3), p.293-312
Main authors: LI, Pei; DONG, Xin Luna; MAURINO, Andrea; SRIVASTAVA, Divesh
Format: Article
Language: eng
Online access: Full text
Description
Summary: Many data sets contain temporal records which span a long period of time; each record is associated with a time stamp and describes some aspects of a real-world entity at a particular time (e.g., author information in DBLP). In such cases, we often wish to identify records that describe the same entity over time and so be able to perform interesting longitudinal data analysis. However, existing record linkage techniques ignore temporal information and fall short for temporal data. This article studies linking temporal records. First, we apply time decay to capture the effect of elapsed time on entity value evolution. Second, instead of comparing each pair of records locally, we propose clustering methods that consider the time order of the records and make global decisions. Experimental results show that our algorithms significantly outperform traditional linkage methods on various temporal data sets.
ISSN: 1673-7350, 2095-2228, 1673-7466, 2095-2236
DOI: 10.1007/s11704-012-2002-5