Improving the Robustness of Distributed Failure Detectors in Adverse Conditions

Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such...

Bibliographic Details
Published in: Revista IEEE América Latina 2012-01, Vol.10 (1), p.1364-1369
Main authors: Lemos, F. T. C., Sato, L. M.
Format: Article
Language: eng
container_end_page 1369
container_issue 1
container_start_page 1364
container_title Revista IEEE América Latina
container_volume 10
creator Lemos, F. T. C.
Sato, L. M.
description Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such as loosely connected and administered computing grids). It packs redundancy into heartbeat messages, thereby improving on the robustness of the traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.
doi_str_mv 10.1109/TLA.2012.6142485
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_920217571</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6142485</ieee_id><sourcerecordid>2580569421</sourcerecordid><originalsourceid>FETCH-LOGICAL-c206t-7759f90b11c20e35ca8845ce83167032e86ed02a8d3fe8453a5b7042ed5f21e63</originalsourceid><addsrcrecordid>eNpdkM1LAzEQxYMoWKt3wUvw5GVrPja72WNprRYKBannsB-zmtJuaiZb8L83pVXE08zwfvN4PEJuORtxzorH1WI8EoyLUcZTkWp1RgZcpTphRSHO_-yX5ApxzZjUmZYDspxvd97tbfdOwwfQV1f1GDpApK6lU4vB26oP0NBZaTe9BzqFAHVwHqnt6LjZg0egE9c1NljX4TW5aMsNws1pDsnb7Gk1eUkWy-f5ZLxIasGykOS5KtqCVZzHG6SqS61TVYOWPMuZFKAzaJgodSNbiIosVZWzVECjWsEhk0PycPSN6T97wGC2FmvYbMoOXI-GM15kkke3iN7_Q9eu911MZwrBBM9VziPEjlDtHaKH1uy83Zb-KzqZQ8EmFmwOBZtTwfHl7vhiAeAX_1G_AZQEdcA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>920217571</pqid></control><display><type>article</type><title>Improving the Robustness of Distributed Failure Detectors in Adverse Conditions</title><source>IEEE Electronic Library (IEL)</source><creator>Lemos, F. T. C. ; Sato, L. M.</creator><creatorcontrib>Lemos, F. T. C. ; Sato, L. M.</creatorcontrib><description>Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such as loosely connected and administered computing grids). It packs redundancy into heartbeat messages, thereby improving on the robustness of the traditional protocols. 
Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.</description><identifier>ISSN: 1548-0992</identifier><identifier>EISSN: 1548-0992</identifier><identifier>DOI: 10.1109/TLA.2012.6142485</identifier><language>eng</language><publisher>Los Alamitos: IEEE</publisher><subject>Biomedical monitoring ; Detectors ; Distributed Failure Detectors ; Failure ; Failure Detection ; Fault tolerance ; Heart beat ; Messages ; Monitoring ; Networks ; Payloads ; Robustness ; Strategy</subject><ispartof>Revista IEEE América Latina, 2012-01, Vol.10 (1), p.1364-1369</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jan 2012</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6142485$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6142485$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Lemos, F. T. C.</creatorcontrib><creatorcontrib>Sato, L. M.</creatorcontrib><title>Improving the Robustness of Distributed Failure Detectors in Adverse Conditions</title><title>Revista IEEE América Latina</title><addtitle>T-LA</addtitle><description>Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such as loosely connected and administered computing grids). 
It packs redundancy into heartbeat messages, thereby improving on the robustness of the traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.</description><subject>Biomedical monitoring</subject><subject>Detectors</subject><subject>Distributed Failure Detectors</subject><subject>Failure</subject><subject>Failure Detection</subject><subject>Fault tolerance</subject><subject>Heart beat</subject><subject>Messages</subject><subject>Monitoring</subject><subject>Networks</subject><subject>Payloads</subject><subject>Robustness</subject><subject>Strategy</subject><issn>1548-0992</issn><issn>1548-0992</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkM1LAzEQxYMoWKt3wUvw5GVrPja72WNprRYKBannsB-zmtJuaiZb8L83pVXE08zwfvN4PEJuORtxzorH1WI8EoyLUcZTkWp1RgZcpTphRSHO_-yX5ApxzZjUmZYDspxvd97tbfdOwwfQV1f1GDpApK6lU4vB26oP0NBZaTe9BzqFAHVwHqnt6LjZg0egE9c1NljX4TW5aMsNws1pDsnb7Gk1eUkWy-f5ZLxIasGykOS5KtqCVZzHG6SqS61TVYOWPMuZFKAzaJgodSNbiIosVZWzVECjWsEhk0PycPSN6T97wGC2FmvYbMoOXI-GM15kkke3iN7_Q9eu911MZwrBBM9VziPEjlDtHaKH1uy83Zb-KzqZQ8EmFmwOBZtTwfHl7vhiAeAX_1G_AZQEdcA</recordid><startdate>201201</startdate><enddate>201201</enddate><creator>Lemos, F. T. C.</creator><creator>Sato, L. M.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>201201</creationdate><title>Improving the Robustness of Distributed Failure Detectors in Adverse Conditions</title><author>Lemos, F. T. C. ; Sato, L. 
M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c206t-7759f90b11c20e35ca8845ce83167032e86ed02a8d3fe8453a5b7042ed5f21e63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Biomedical monitoring</topic><topic>Detectors</topic><topic>Distributed Failure Detectors</topic><topic>Failure</topic><topic>Failure Detection</topic><topic>Fault tolerance</topic><topic>Heart beat</topic><topic>Messages</topic><topic>Monitoring</topic><topic>Networks</topic><topic>Payloads</topic><topic>Robustness</topic><topic>Strategy</topic><toplevel>online_resources</toplevel><creatorcontrib>Lemos, F. T. C.</creatorcontrib><creatorcontrib>Sato, L. M.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>Revista IEEE América Latina</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lemos, F. T. C.</au><au>Sato, L. 
M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Improving the Robustness of Distributed Failure Detectors in Adverse Conditions</atitle><jtitle>Revista IEEE América Latina</jtitle><stitle>T-LA</stitle><date>2012-01</date><risdate>2012</risdate><volume>10</volume><issue>1</issue><spage>1364</spage><epage>1369</epage><pages>1364-1369</pages><issn>1548-0992</issn><eissn>1548-0992</eissn><abstract>Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such as loosely connected and administered computing grids). It packs redundancy into heartbeat messages, thereby improving on the robustness of the traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.</abstract><cop>Los Alamitos</cop><pub>IEEE</pub><doi>10.1109/TLA.2012.6142485</doi><tpages>6</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1548-0992
ispartof Revista IEEE América Latina, 2012-01, Vol.10 (1), p.1364-1369
issn 1548-0992
1548-0992
language eng
recordid cdi_proquest_journals_920217571
source IEEE Electronic Library (IEL)
subjects Biomedical monitoring
Detectors
Distributed Failure Detectors
Failure
Failure Detection
Fault tolerance
Heart beat
Messages
Monitoring
Networks
Payloads
Robustness
Strategy
title Improving the Robustness of Distributed Failure Detectors in Adverse Conditions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T21%3A17%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improving%20the%20Robustness%20of%20Distributed%20Failure%20Detectors%20in%20Adverse%20Conditions&rft.jtitle=Revista%20IEEE%20Am%C3%A9rica%20Latina&rft.au=Lemos,%20F.%20T.%20C.&rft.date=2012-01&rft.volume=10&rft.issue=1&rft.spage=1364&rft.epage=1369&rft.pages=1364-1369&rft.issn=1548-0992&rft.eissn=1548-0992&rft_id=info:doi/10.1109/TLA.2012.6142485&rft_dat=%3Cproquest_RIE%3E2580569421%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=920217571&rft_id=info:pmid/&rft_ieee_id=6142485&rfr_iscdi=true