Selecting candidate rows for deduplication

The present invention extends to methods, systems, and computer program products for selecting candidate records for deduplication from a table. A table can be processed to compute an inverse index for each field of the table. A deduplication algorithm can traverse the inverse indices in accordance...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: HUDIS EFIM, PELEG GAD, ZINAR YARON, ORLIN YIFAT, GUREVICH YURI, NOVIK GAL
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The present invention extends to methods, systems, and computer program products for selecting candidate records for deduplication from a table. A table can be processed to compute an inverse index for each field of the table. A deduplication algorithm can traverse the inverse indices in accordance with a flexible user-defined policy to identify candidate records for deduplication. Both exact matches and approximate matches can be found.