Method for checking and processing repeated data
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The invention discloses a method for checking and processing repeated data. The method comprises the following steps: A, acquiring the data to be verified and initializing its data structure; B, calculating the hash code of each datum in the data to be verified; C, checking, according to the hash code of each datum, whether repeated data exist among the data, and updating a tag code of each datum according to the checking result; D, transmitting each datum whose tag code has been updated to each distributed computing node, so that each node determines whether the datum repeats any of its local data; E, transmitting each datum compared by the distributed computing nodes to a summarizing node. The method shortens the comparison time for massive data and improves the efficiency of data lookup and cleaning. |
---|
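
Steps A to E of the summary can be illustrated with a minimal single-process sketch. It is not the patented implementation: the record does not specify the hash function, the tag code values, or how nodes communicate, so SHA-256, the `UNIQUE`/`DUPLICATE` codes, and every function name below are assumptions chosen for illustration, and the distributed nodes are simulated as plain in-memory lists.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical tag codes; the patent record does not define their values.
UNIQUE, DUPLICATE = 0, 1


@dataclass
class Record:
    value: str
    hash_code: str = ""
    tag: int = UNIQUE


def hash_of(value: str) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash code.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def tag_batch_duplicates(records):
    """Step C: tag every repeat of an earlier datum within the batch."""
    seen = set()
    for r in records:
        if r.hash_code in seen:
            r.tag = DUPLICATE
        else:
            seen.add(r.hash_code)


def compare_on_node(records, local_hashes):
    """Step D: one simulated node compares each datum with its local data."""
    return [
        DUPLICATE if r.tag == DUPLICATE or r.hash_code in local_hashes else UNIQUE
        for r in records
    ]


def check_repeated(values, node_data):
    # Step A: acquire the data and initialize the data structure.
    records = [Record(v) for v in values]
    # Step B: calculate the hash code of each datum.
    for r in records:
        r.hash_code = hash_of(r.value)
    tag_batch_duplicates(records)
    # Step D: send the tagged data to every node (simulated in-process).
    per_node = [
        compare_on_node(records, {hash_of(v) for v in local})
        for local in node_data
    ]
    # Step E: the summarizing node marks a datum as repeated
    # if the batch check or any node flagged it.
    final = [r.tag for r in records]
    for tags in per_node:
        final = [max(a, b) for a, b in zip(final, tags)]
    return [(r.value, t) for r, t in zip(records, final)]
```

For example, `check_repeated(["a", "b", "a", "c"], [["b"], ["x"]])` flags the second `"a"` as a repeat within the batch and `"b"` as a repeat of the first node's local data. Comparing hash codes rather than full values keeps each membership test constant-time on average, which is consistent with the shortened comparison time the summary claims; a real system would also have to decide how to treat hash collisions.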