Method for storing data pages in data storage device using similarity-based data reduction
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Abstract: | A method of storing a received page of data (202) in a data storage device (102) is provided. The method comprises: (i) when the page of data is received, obtaining a sample set comprising two or more samples of the received page, (ii) calculating a new hash value for each of the two or more samples, (iii) identifying one or more page identifiers (302AA to 302NN, 404AA to 404NN) associated with one or more pre-computed hash values in a key value store (300), (iv) ranking the identified page identifiers by the number of times they are identified, (v) determining a similarity between the received page and one or more pages corresponding to the one or more ranked identifiers, where the similarity is measured by a plurality of matching data sub-strings, a sub-string being a sequence of bytes in a block or page, (vi) processing the received page of data accordi |
---|
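Steps (i) to (v) of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the fixed-offset sampling scheme, the sample length and stride, the BLAKE2b hash, the in-memory dictionary standing in for the key value store, and every function name are assumptions made for illustration only. Step (vi) is omitted because the record truncates its description.

```python
import hashlib
from collections import Counter, defaultdict

SAMPLE_LEN = 16     # assumed sample length in bytes
SAMPLE_STRIDE = 64  # assumed distance between sample offsets


def sample_page(page: bytes) -> list[bytes]:
    """Step (i): take samples at fixed offsets (the sampling scheme is assumed)."""
    return [page[i:i + SAMPLE_LEN]
            for i in range(0, len(page) - SAMPLE_LEN + 1, SAMPLE_STRIDE)]


def hash_sample(sample: bytes) -> bytes:
    """Step (ii): hash one sample; an 8-byte BLAKE2b digest is an arbitrary choice."""
    return hashlib.blake2b(sample, digest_size=8).digest()


def index_page(page_id: str, page: bytes, kv_store: dict[bytes, set[str]]) -> None:
    """Record pre-computed sample hashes of a stored page (hypothetical helper)."""
    for sample in sample_page(page):
        kv_store[hash_sample(sample)].add(page_id)


def rank_candidates(page: bytes, kv_store: dict[bytes, set[str]]) -> list[tuple[str, int]]:
    """Steps (iii)-(iv): look up page identifiers and rank them by hit count."""
    counts: Counter[str] = Counter()
    for sample in sample_page(page):
        for page_id in kv_store.get(hash_sample(sample), ()):
            counts[page_id] += 1
    return counts.most_common()


def matching_substrings(a: bytes, b: bytes, length: int = SAMPLE_LEN) -> int:
    """Step (v): count distinct `length`-byte sub-strings that occur in both pages."""
    subs_a = {a[i:i + length] for i in range(len(a) - length + 1)}
    subs_b = {b[i:i + length] for i in range(len(b) - length + 1)}
    return len(subs_a & subs_b)
```

In this sketch a caller would index stored pages with `index_page`, call `rank_candidates` on an incoming page, and then run `matching_substrings` only against the top-ranked candidates, so the costlier byte-level comparison is limited to pages that already share sample hashes.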