COORDINATING FAULT RECOVERY IN A DISTRIBUTED SYSTEM
In various embodiments, methods and systems for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system is provided. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failu...
Gespeichert in:
Hauptverfasser: | , , , , , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In various embodiments, methods and systems for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system is provided. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failure or hardware failure of the tenant infrastructure supporting a service application of the tenant. A fault recovery plan is communicated to the tenant to notify the tenant of the fault occurrence and actions taken to restore the tenant infrastructure. It is determined whether a fault recovery plan response is received from the tenant; the fault recovery plan response is an acknowledgement from the tenant of the fault recovery plan. Upon receiving the fault recovery plan response or at the expiration of a predefined time limit, the fault recovery plan is executed to restore the tenant infrastructure. |
---|