Egida: an extensible toolkit for low-overhead fault-tolerance

We discuss the design and implementation of Egida, an object-oriented toolkit designed to support transparent rollback-recovery. Egida exports a simple specification language that can be used to express arbitrary rollback recovery protocols. From this specification, Egida automatically synthesizes a...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Rao, S., Alvisi, L., Vin, H.M.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We discuss the design and implementation of Egida, an object-oriented toolkit designed to support transparent rollback-recovery. Egida exports a simple specification language that can be used to express arbitrary rollback recovery protocols. From this specification, Egida automatically synthesizes an implementation of the specified protocol by gluing together the appropriate objects from an available library of "building blocks". Egida is extensible and facilitates rapid implementation of rollback recovery protocols with minimal programming effort. We have integrated Egida with the MPICH implementation of the MPI standard. Existing MPI applications can rake advantage of Egida without any modifications: fault-tolerance is achieved transparently-all that is needed is a simple re-link of the MPI application with Egida.
ISSN:0731-3071
2375-124X
DOI:10.1109/FTCS.1999.781033