SYSTEM AND METHOD FOR RAPID ESTIMATION OF DATA SIMILARITY
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Abstract: | Systems and methods for estimating data similarity between an inserted volume of data and a stored volume of data during file backup of a deduplicated data store, for use when the ancestry of the inserted data relative to previously stored data is unknown. The estimate identifies an ancestor of the inserted volume among the stored volumes so that only the incremental data of the inserted volume is stored. The systems and methods comprise: ingesting a volume of data; creating a subset of bits for the ingested volume using a filtering process; creating a subset of bits for each volume of stored data using the same filtering process; comparing the subset of bits for the ingested volume with the subset of bits for each of the stored volumes; and determining the stored volume whose subset of bits has the most bits in common with the subset of bits for the ingested volume. |
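The steps named in the abstract (filter each volume down to a small subset, compare subsets, pick the stored volume with the largest overlap) can be illustrated with a minimal Python sketch. This is not the patented method: the record does not describe the filtering process, so the sketch assumes fixed-size chunks, SHA-1 chunk hashes, and a filter that keeps only hashes whose low bits are zero. The names `sketch`, `closest_ancestor`, `CHUNK_SIZE`, and `SAMPLE_MASK` are hypothetical.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4096   # assumed fixed chunk size; not specified in the record
SAMPLE_MASK = 0x0F  # assumed filter: keep hashes whose low 4 bits are zero


def sketch(volume: bytes, chunk_size: int = CHUNK_SIZE,
           mask: int = SAMPLE_MASK) -> frozenset[int]:
    """Reduce a volume to a small, deterministic sample of chunk hashes."""
    kept = set()
    for offset in range(0, len(volume), chunk_size):
        digest = hashlib.sha1(volume[offset:offset + chunk_size]).digest()
        value = int.from_bytes(digest[:8], "big")
        # The same rule is applied to every volume, so identical chunks
        # are either kept in both sketches or dropped from both.
        if value & mask == 0:
            kept.add(value)
    return frozenset(kept)


def closest_ancestor(ingested: bytes, stored: dict[str, bytes],
                     chunk_size: int = CHUNK_SIZE,
                     mask: int = SAMPLE_MASK) -> tuple[str | None, int]:
    """Return the stored volume whose sketch overlaps the ingested one most."""
    target = sketch(ingested, chunk_size, mask)
    best_name, best_overlap = None, 0
    for name, volume in stored.items():
        overlap = len(target & sketch(volume, chunk_size, mask))
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name, best_overlap
```

In a real system the sketches of stored volumes would presumably be computed once and cached rather than recomputed per comparison; the loop above recomputes them only to keep the example short. A result of `(None, 0)` means no stored volume shared any sampled chunk with the ingested volume.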