Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance

Hypervisor-based fault tolerance (HBFT), which synchronizes the state between the primary VM and the backup VM at a high frequency of tens to hundreds of milliseconds, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economic a...


Bibliographic details
Published in:	IEEE transactions on computers, 2011-12, Vol.60 (12), p.1718-1729
Main authors:	Zhu, Jun; Jiang, Zhefu; Xiao, Zhen; Li, Xiaoming
Format:	Article
Language:	eng
Subjects:	checkpoint; Computer architecture; Fault tolerance; Fault tolerant systems; hypervisor; recovery; Synchronization; Virtual machine monitors; Virtualization
Online access:	Order full text

container_end_page	1729
container_issue	12
container_start_page	1718
container_title	IEEE transactions on computers
container_volume	60
creator	Zhu, Jun; Jiang, Zhefu; Xiao, Zhen; Li, Xiaoming
description	Hypervisor-based fault tolerance (HBFT), which synchronizes the state between the primary VM and the backup VM at a high frequency of tens to hundreds of milliseconds, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economical and transparent fault-tolerant solution. However, the advantages currently come at the cost of substantial performance overhead during failure-free operation, especially for memory-intensive applications. This paper presents an in-depth examination of HBFT and options to improve its performance. Based on the behavior of memory accesses among checkpointing epochs, we introduce two optimizations, read-fault reduction and write-fault prediction, for the memory tracking mechanism. These two optimizations improve the performance by 31 percent and 21 percent, respectively, for some applications. Then, we present software superpage, which efficiently maps large memory regions between virtual machines (VMs). Our optimization improves the performance of HBFT by a factor of 1.4 to 2.2 and achieves about 60 percent of that of the native VM.
doi_str_mv	10.1109/TC.2010.224
format	Article
fullrecord	<record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TC_2010_224</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5629326</ieee_id><sourcerecordid>10_1109_TC_2010_224</sourcerecordid><originalsourceid>FETCH-LOGICAL-c253t-4b4abae6d2df376ea7b07d03bbfad4320bf0d96785df10f3d2cbffdd8f1377ea3</originalsourceid><addsrcrecordid>eNo9kMFLwzAYxYMoOKcnj15yl84vSZssRyluCpMpVq8lab7YSNeOtDtsf70dE0-PB7_3Dj9CbhnMGAP9UOQzDmPhPD0jE5ZlKtE6k-dkAsDmiRYpXJKrvv8BAMlBT8j7ejuETTiE9psONdI3jL6LG9NWSDtPv0Icdqahr6aqQ4v0Y99WdezacDBD6Fo6snRhds1Ai67BeJxdkwtvmh5v_nJKPhdPRf6crNbLl_xxlVQ8E0OS2tRYg9Jx54WSaJQF5UBY641LBQfrwWmp5pnzDLxwvLLeOzf3TCiFRkzJ_em3il3fR_TlNoaNifuSQXm0URZ5ebRRjjZG-u5EB0T8JzPJteBS_AKax10v</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance</title><source>IEEE Electronic Library (IEL)</source><creator>Zhu, Jun ; Jiang, Zhefu ; Xiao, Zhen ; Li, Xiaoming</creator><creatorcontrib>Zhu, Jun ; Jiang, Zhefu ; Xiao, Zhen ; Li, Xiaoming</creatorcontrib><description>Hypervisor-based fault tolerance (HBFT), which synchronizes the state between the primary VM and the backup VM at a high frequency of tens to hundreds of milliseconds, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economic and transparent fault tolerant solution. However, the advantages currently come at the cost of substantial performance overhead during failure-free, especially for memory intensive applications. This paper presents an in-depth examination of HBFT and options to improve its performance. Based on the behavior of memory accesses among checkpointing epochs, we introduce two optimizations, read-fault reduction and write-fault prediction, for the memory tracking mechanism. 
These two optimizations improve the performance by 31 percent and 21 percent, respectively, for some applications. Then, we present software superpage which efficiently maps large memory regions between virtual machines (VM). Our optimization improves the performance of HBFT by a factor of 1.4 to 2.2 and achieves about 60 percent of that of the native VM.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2010.224</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>IEEE</publisher><subject>checkpoint ; Computer architecture ; Fault tolerance ; Fault tolerant systems ; hypervisor ; recovery ; Synchronization ; Virtual machine monitors ; Virtualization</subject><ispartof>IEEE transactions on computers, 2011-12, Vol.60 (12), p.1718-1729</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c253t-4b4abae6d2df376ea7b07d03bbfad4320bf0d96785df10f3d2cbffdd8f1377ea3</citedby><cites>FETCH-LOGICAL-c253t-4b4abae6d2df376ea7b07d03bbfad4320bf0d96785df10f3d2cbffdd8f1377ea3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5629326$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5629326$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhu, Jun</creatorcontrib><creatorcontrib>Jiang, Zhefu</creatorcontrib><creatorcontrib>Xiao, Zhen</creatorcontrib><creatorcontrib>Li, Xiaoming</creatorcontrib><title>Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>Hypervisor-based fault tolerance (HBFT), 
which synchronizes the state between the primary VM and the backup VM at a high frequency of tens to hundreds of milliseconds, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economic and transparent fault tolerant solution. However, the advantages currently come at the cost of substantial performance overhead during failure-free, especially for memory intensive applications. This paper presents an in-depth examination of HBFT and options to improve its performance. Based on the behavior of memory accesses among checkpointing epochs, we introduce two optimizations, read-fault reduction and write-fault prediction, for the memory tracking mechanism. These two optimizations improve the performance by 31 percent and 21 percent, respectively, for some applications. Then, we present software superpage which efficiently maps large memory regions between virtual machines (VM). Our optimization improves the performance of HBFT by a factor of 1.4 to 2.2 and achieves about 60 percent of that of the native VM.</description><subject>checkpoint</subject><subject>Computer architecture</subject><subject>Fault tolerance</subject><subject>Fault tolerant systems</subject><subject>hypervisor</subject><subject>recovery</subject><subject>Synchronization</subject><subject>Virtual machine 
monitors</subject><subject>Virtualization</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kMFLwzAYxYMoOKcnj15yl84vSZssRyluCpMpVq8lab7YSNeOtDtsf70dE0-PB7_3Dj9CbhnMGAP9UOQzDmPhPD0jE5ZlKtE6k-dkAsDmiRYpXJKrvv8BAMlBT8j7ejuETTiE9psONdI3jL6LG9NWSDtPv0Icdqahr6aqQ4v0Y99WdezacDBD6Fo6snRhds1Ai67BeJxdkwtvmh5v_nJKPhdPRf6crNbLl_xxlVQ8E0OS2tRYg9Jx54WSaJQF5UBY641LBQfrwWmp5pnzDLxwvLLeOzf3TCiFRkzJ_em3il3fR_TlNoaNifuSQXm0URZ5ebRRjjZG-u5EB0T8JzPJteBS_AKax10v</recordid><startdate>20111201</startdate><enddate>20111201</enddate><creator>Zhu, Jun</creator><creator>Jiang, Zhefu</creator><creator>Xiao, Zhen</creator><creator>Li, Xiaoming</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20111201</creationdate><title>Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance</title><author>Zhu, Jun ; Jiang, Zhefu ; Xiao, Zhen ; Li, Xiaoming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c253t-4b4abae6d2df376ea7b07d03bbfad4320bf0d96785df10f3d2cbffdd8f1377ea3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>checkpoint</topic><topic>Computer architecture</topic><topic>Fault tolerance</topic><topic>Fault tolerant systems</topic><topic>hypervisor</topic><topic>recovery</topic><topic>Synchronization</topic><topic>Virtual machine monitors</topic><topic>Virtualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhu, Jun</creatorcontrib><creatorcontrib>Jiang, Zhefu</creatorcontrib><creatorcontrib>Xiao, Zhen</creatorcontrib><creatorcontrib>Li, Xiaoming</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 
2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhu, Jun</au><au>Jiang, Zhefu</au><au>Xiao, Zhen</au><au>Li, Xiaoming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2011-12-01</date><risdate>2011</risdate><volume>60</volume><issue>12</issue><spage>1718</spage><epage>1729</epage><pages>1718-1729</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>Hypervisor-based fault tolerance (HBFT), which synchronizes the state between the primary VM and the backup VM at a high frequency of tens to hundreds of milliseconds, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economic and transparent fault tolerant solution. However, the advantages currently come at the cost of substantial performance overhead during failure-free, especially for memory intensive applications. This paper presents an in-depth examination of HBFT and options to improve its performance. Based on the behavior of memory accesses among checkpointing epochs, we introduce two optimizations, read-fault reduction and write-fault prediction, for the memory tracking mechanism. These two optimizations improve the performance by 31 percent and 21 percent, respectively, for some applications. Then, we present software superpage which efficiently maps large memory regions between virtual machines (VM). 
Our optimization improves the performance of HBFT by a factor of 1.4 to 2.2 and achieves about 60 percent of that of the native VM.</abstract><pub>IEEE</pub><doi>10.1109/TC.2010.224</doi><tpages>12</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9340
ispartof	IEEE transactions on computers, 2011-12, Vol.60 (12), p.1718-1729
issn	0018-9340; 1557-9956
language	eng
recordid	cdi_crossref_primary_10_1109_TC_2010_224
source	IEEE Electronic Library (IEL)
subjects	checkpoint; Computer architecture; Fault tolerance; Fault tolerant systems; hypervisor; recovery; Synchronization; Virtual machine monitors; Virtualization
title	Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T00%3A21%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimizing%20the%20Performance%20of%20Virtual%20Machine%20Synchronization%20for%20Fault%20Tolerance&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Zhu,%20Jun&rft.date=2011-12-01&rft.volume=60&rft.issue=12&rft.spage=1718&rft.epage=1729&rft.pages=1718-1729&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2010.224&rft_dat=%3Ccrossref_RIE%3E10_1109_TC_2010_224%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5629326&rfr_iscdi=true