Block-level data de-duplication using thinly provisioned data storage volumes
Data segments are logically organized in groups in a data repository. Each segment is stored at an index in the data repository. In association with a write request, a hash algorithm is applied to the data segment to generate a group identifier. Each group is identifiable by a corresponding group id...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Data segments are logically organized in groups in a data repository. Each segment is stored at an index in the data repository. In association with a write request, a hash algorithm is applied to the data segment to generate a group identifier. Each group is identifiable by a corresponding group identifier. The group identifier is applied to a hash tree to determine whether a corresponding group in the data repository exists. Each existing group in the data repository corresponds to a leaf of the hash tree. If no corresponding group exists in the data repository, the data segment is stored in a new group in the data repository. However, if a corresponding group exists, the group is further searched to determine if a data segment matching the data segment to be stored is already stored. The data segment can be stored in accordance with the results of the search. |
---|