Improving Restore Performance for In-Line Backup System Combining Deduplication and Delta Compression
Published in: | IEEE transactions on parallel and distributed systems 2020-10, Vol.31 (10), p.2302-2314 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Data deduplication removes duplicate chunks efficiently but introduces chunk fragmentation, which decreases restore performance. Rewriting algorithms have been proposed to reduce this fragmentation. Delta compression is often used as a complement to data deduplication to further improve storage efficiency. We observe that delta compression introduces a new type of chunk fragmentation, which stems from delta-compressing chunks whose base chunks are fragmented. This new type of fragmentation severely decreases restore performance and cannot be addressed by existing rewriting algorithms. To address this problem, we propose SDC, a scheme that performs post-deduplication delta compression only for chunks whose bases can be found directly in the restore cache; this eliminates additional disk reads for base chunks and thus avoids the new type of chunk fragmentation. In addition, self-referenced chunks can be fragmented, which decreases restore performance, and these fragmented chunks can serve as bases, decreasing restore performance repeatedly. We propose a hybrid rewriting scheme for SDC to rewrite such fragmented chunks. Experimental results show that SDC improves the restore performance of the approach that directly performs delta compression after data deduplication by 2.9-16.9x, and achieves more than 95 percent of its compression gains. |
---|---|
ISSN: | 1045-9219 1558-2183 |
DOI: | 10.1109/TPDS.2020.2991030 |
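
The core decision described in the abstract (delta-compress a chunk only when its base is already in the restore cache) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the restore cache is modeled or which eviction policy it uses, so the LRU policy, the container-level granularity, and the names `ContainerCache` and `choose_storage` are all assumptions made for illustration.

```python
from collections import OrderedDict


class ContainerCache:
    """Hypothetical LRU model of the restore cache, tracking container IDs.

    Assumption: the abstract does not specify the cache policy; LRU is
    used here only to make the example concrete.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._ids = OrderedDict()

    def __contains__(self, container_id):
        # Membership test only; it does not change recency.
        return container_id in self._ids

    def touch(self, container_id):
        # Mark the container as most recently used, evicting the
        # least recently used one if the cache is over capacity.
        self._ids[container_id] = True
        self._ids.move_to_end(container_id)
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)


def choose_storage(is_duplicate, base_container, cache):
    """Return how a chunk would be stored: 'dedup', 'delta', or 'full'.

    is_duplicate:   True if deduplication already matched the chunk.
    base_container: ID of the container holding a similar base chunk,
                    or None if no base was found (hypothetical input).
    """
    if is_duplicate:
        return "dedup"
    # Delta-compress only if the base needs no extra disk read on restore.
    if base_container is not None and base_container in cache:
        return "delta"
    return "full"


cache = ContainerCache(capacity=2)
cache.touch(1)
cache.touch(2)
print(choose_storage(False, 1, cache))  # base container 1 is cached
cache.touch(3)                          # evicts container 1
print(choose_storage(False, 1, cache))  # base container 1 no longer cached
```

In this sketch a chunk whose base has been evicted is stored in full rather than as a delta, trading some compression for a restore that does not need to fetch the base from disk.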