Load history calculation in internal stage replication

Embodiments of the present disclosure provide techniques for deduplicating files during internal stage replication using a directory table of the replicated internal stage that is modified as a cache for storing and retrieving original file-level metadata for the replicated files. An initial list of...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Zhang, Yanrui, Han, Chong, Al Mahmood, Abdullah, Liang, Jiaxing, Iyer, Ganeshan Ramachandran, Mahesh, Nithin
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Embodiments of the present disclosure provide techniques for deduplicating files during internal stage replication using a directory table of the replicated internal stage that is modified as a cache for storing and retrieving original file-level metadata for the replicated files. An initial list of candidate files for loading from the internal stage to a table of the target deployment is prepared based on the files listed in the internal stage, and refined using a directory table lookup. If there is any inconsistency between the files registered in the directory table and the files listed in the internal stage, the target deployment will inspect the user-defined file-level metadata to obtain original file-level metadata for each file that is present in the internal stage but not in the directory table. This information may be used during deduplication to ensure that no duplicate files are loaded.