Method for checking and processing repeated data

Bibliographic details
Main authors: XIONG DAOYONG, LONG QINGLIN, CHEN CHENGZHI, LIANG GUOHUI, LI AIMIN
Format: Patent
Language: English
Description
Summary: The invention discloses a method for checking and processing repeated data. The method comprises the following steps: A, acquiring the data to be verified and initializing its data structure; B, calculating the hash code of each datum in the data to be verified; C, checking, according to the hash codes, whether repeated data exist among the data, and updating the tag code of each datum according to the checking result; D, transmitting each datum whose tag code has been updated to each distributed computing node, so that each node determines whether that datum repeats any of its local data; E, transmitting each datum compared by the distributed computing nodes to a summarizing node. The method shortens the comparison time for massive data and improves the efficiency of data lookup and cleaning.
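
Steps A to E can be illustrated with a minimal single-process sketch. It is not the patented implementation. The record does not specify the hash function, the tag code values, or the transport between nodes, so everything named here is an assumption made for illustration: SHA-256 as the hash, the tag strings `UNCHECKED`, `UNIQUE_IN_BATCH`, `DUP_IN_BATCH` and `DUP_OF_LOCAL`, and plain Python sets standing in for the local data of each distributed computing node. Records with equal hash codes are treated as repeated data.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Record:
    payload: str
    hash_code: str = ""
    tag: str = "UNCHECKED"  # hypothetical tag code, not from the patent


def sha256_hex(text: str) -> str:
    # Assumed hash function; the abstract only says "hash code".
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_duplicates(raw, node_stores):
    """Run steps A-E in one process.

    raw         -- iterable of strings to verify
    node_stores -- list of sets of strings, one set per simulated node
    """
    # Step A: acquire the data and initialize its structure.
    records = [Record(payload=p) for p in raw]

    # Step B: compute a hash code for each datum.
    for rec in records:
        rec.hash_code = sha256_hex(rec.payload)

    # Step C: detect repeats inside the batch and update tag codes.
    seen = set()
    for rec in records:
        if rec.hash_code in seen:
            rec.tag = "DUP_IN_BATCH"
        else:
            seen.add(rec.hash_code)
            rec.tag = "UNIQUE_IN_BATCH"

    # Step D: each node compares the tagged records with its local data.
    per_node_matches = []
    for store in node_stores:
        local_hashes = {sha256_hex(item) for item in store}
        per_node_matches.append(
            {rec.hash_code for rec in records if rec.hash_code in local_hashes}
        )

    # Step E: the summarizing node merges the per-node results.
    matched = set().union(*per_node_matches)
    for rec in records:
        if rec.tag == "UNIQUE_IN_BATCH" and rec.hash_code in matched:
            rec.tag = "DUP_OF_LOCAL"
    return records


if __name__ == "__main__":
    result = check_duplicates(["a", "b", "a", "c"], [{"b"}, {"x"}])
    for rec in result:
        print(rec.payload, rec.tag)
```

In this sketch the in-batch check uses a single set lookup per record, and each node only compares hash codes rather than full payloads, which is one plausible reading of how the method shortens comparison time. In a real deployment the loop in step D would run in parallel on separate machines.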