COORDINATING FAULT RECOVERY IN A DISTRIBUTED SYSTEM

In various embodiments, methods and systems for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system is provided. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failu...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Rizvi Murtuza, Almida Christopher P, Mani Ajat, Rafiq Wakkas, Nagesharao Pavithra Tyamagondlu, Hassan Akram M.H, Rewaskar Sushant Pramud
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In various embodiments, methods and systems for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system is provided. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failure or hardware failure of the tenant infrastructure supporting a service application of the tenant. A fault recovery plan is communicated to the tenant to notify the tenant of the fault occurrence and actions taken to restore the tenant infrastructure. It is determined whether a fault recovery plan response is received from the tenant; the fault recovery plan response is an acknowledgement from the tenant of the fault recovery plan. Upon receiving the fault recovery plan response or at the expiration of a predefined time limit, the fault recovery plan is executed to restore the tenant infrastructure.