Optimal object state transfer - recovery policies for fault tolerant distributed systems
Recent developments in the field of object-based fault tolerance and the advent of the first OMG FT-CORBA compliant middleware raise new requirements for the design process of distributed fault-tolerant systems. In this work, we introduce a simulation-based design approach based on the optimum effec...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Tagungsbericht |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Recent developments in the field of object-based fault tolerance and the advent of the first OMG FT-CORBA compliant middleware raise new requirements for the design process of distributed fault-tolerant systems. In this work, we introduce a simulation-based design approach based on the optimum effectiveness of the compared fault tolerance schemes. Each scheme is defined as a set of fault tolerance properties for the objects that compose the system. Its optimum effectiveness is determined by the tightest effective checkpoint intervals, for the passively replicated objects. Our approach allows mixing miscellaneous fault tolerance policies, as opposed to the published analytic models, which are best suited in the evaluation of single-server process replication schemes. Special emphasis has been given to the accuracy of the generated estimates using an appropriate simulation output analysis procedure. We provide showcase results and compare two characteristic warm passive replication schemes: one with periodic and another one with load-dependent object state checkpoints. Finally, a trade-off analysis is applied, for determining appropriate checkpoint properties, in respect to a specified design goal. |
---|---|
DOI: | 10.1109/DSN.2004.1311947 |