RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups
Main authors: | , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Scaling up backup storage for an ever-increasing volume of virtual
machine (VM) images is a critical issue in virtualization environments. While
deduplication is known to effectively eliminate duplicates in VM image
storage, it also introduces fragmentation that degrades read performance.
We propose RevDedup, a deduplication system that optimizes reads of the latest
VM image backups using an idea called reverse deduplication. In contrast with
conventional deduplication, which removes duplicates from new data, RevDedup
removes duplicates from old data, thereby shifting fragmentation to old data
while keeping the layout of new data as sequential as possible. We evaluate our
RevDedup prototype using microbenchmarks and real-world workloads. For a 12-week
span of real-world VM images from 160 users, RevDedup achieves high
deduplication efficiency, with storage savings of around 97%, and high backup
and read throughput on the order of 1 GB/s. RevDedup also incurs small metadata
overhead in backup/read operations. |
---|---|
DOI: | 10.48550/arxiv.1302.0621 |
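
The reverse deduplication idea described in the summary can be illustrated with a toy sketch. This is not the RevDedup implementation: the class name `ReverseDedupStore`, the fixed 4-byte chunk size, the SHA-1 fingerprints, and the in-memory slot lists are all assumptions made for illustration. The sketch only shows the direction of deduplication: each new backup is stored whole and in order, while duplicate chunks in the previous backup are replaced by references to the new one.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint used to detect duplicate chunks."""
    return hashlib.sha1(chunk).hexdigest()


class ReverseDedupStore:
    """Toy store: the newest version is always kept fully sequential."""

    def __init__(self):
        # versions[v] is a list of slots, each either
        #   ("data", chunk_bytes) or ("ref", newer_version, slot_index)
        self.versions = []

    def backup(self, image: bytes) -> int:
        new_id = len(self.versions)
        new_slots = [("data", c) for c in split_chunks(image)]

        # Map each fingerprint in the new version to its first slot index.
        index = {}
        for i, (_, chunk) in enumerate(new_slots):
            index.setdefault(fingerprint(chunk), i)

        # Reverse step: remove duplicates from the OLD (previous) version
        # by pointing them at the new version's copy.
        if self.versions:
            prev = self.versions[-1]
            for j, slot in enumerate(prev):
                if slot[0] == "data" and fingerprint(slot[1]) in index:
                    prev[j] = ("ref", new_id, index[fingerprint(slot[1])])

        self.versions.append(new_slots)
        return new_id

    def read(self, version: int) -> bytes:
        return b"".join(self._resolve(s) for s in self.versions[version])

    def _resolve(self, slot) -> bytes:
        # References always point to a newer version, so the chain terminates.
        while slot[0] == "ref":
            _, v, i = slot
            slot = self.versions[v][i]
        return slot[1]

    def stored_bytes(self, version: int) -> int:
        """Bytes physically held by one version (references count as zero)."""
        return sum(len(s[1]) for s in self.versions[version] if s[0] == "data")
```

In this sketch, reading the latest version never follows a reference, which mirrors the stated goal of keeping new data sequential. Reading an older version may follow a chain of references through newer versions, which is where the fragmentation is shifted.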