Block-level data de-duplication using thinly provisioned data storage volumes

Data segments are logically organized in groups in a data repository. Each segment is stored at an index in the data repository. In association with a write request, a hash algorithm is applied to the data segment to generate a group identifier. Each group is identifiable by a corresponding group id...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	POPOVSKI VLADIMIR, NAHUM NELSON
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Data segments are logically organized in groups in a data repository. Each segment is stored at an index in the data repository. In association with a write request, a hash algorithm is applied to the data segment to generate a group identifier. Each group is identifiable by a corresponding group identifier. The group identifier is applied to a hash tree to determine whether a corresponding group in the data repository exists. Each existing group in the data repository corresponds to a leaf of the hash tree. If no corresponding group exists in the data repository, the data segment is stored in a new group in the data repository. However, if a corresponding group exists, the group is further searched to determine if a data segment matching the data segment to be stored is already stored. The data segment can be stored in accordance with the results of the search.