VMckpt: lightweight and live virtual machine checkpointing


Bibliographic Details
Published in: Science China. Information sciences 2012-12, Vol.55 (12), p.2865-2880
Main authors: Liu, HaiKun; Jin, Hai; Liao, XiaoFei; Ma, Bo; Xu, ChengZhong
Format: Article
Language: English
Description
Summary: Recent advances in virtualization technology provide a new approach to checkpoint/restart at the virtual machine (VM) level. In contrast to traditional process-level checkpointing, checkpointing at the virtualization layer brings several advantages, such as compatibility, transparency, flexibility, and simplicity. However, because the virtualization layer has little semantic knowledge about the operating system and the applications running on top of it, VM-layer checkpointing requires saving the entire operating system state rather than a single process. The overhead may render the approach impractical. To reduce the size of a VM checkpoint, in this paper we propose a page eviction scheme and an incremental checkpointing mechanism to avoid saving unnecessary VM pages in the checkpoint. To keep the system online transparently, we propose a live checkpointing mechanism that saves the memory image in a copy-on-write (COW) manner. We implement these performance optimization mechanisms in a prototype system called VMckpt. Experimental results with a group of representative applications show that our page eviction scheme and incremental checkpointing can reduce the checkpoint file size by up to 87% and shorten the total checkpointing/restart time by up to 71%, in comparison with Xen's default checkpointing mechanism. The observed application downtime due to checkpointing can be reduced to as little as 300 ms.
ISSN: 1674-733X, 1869-1919
DOI: 10.1007/s11432-011-4501-7