COORDINATING FAULT RECOVERY IN A DISTRIBUTED SYSTEM

Bibliographic details
Main authors: NAGESHARAO, Pavithra Tyamagondlu, MANI, Ajay, RIZVI, Murtuza, RAFIQ, Wakkas, REWASKAR, Sushant Pramod, ALMIDA, Christopher P, HASSAN, Akram M.H
Format: Patent
Language: eng ; fre ; ger
Description
Summary: In various embodiments, methods and systems are provided for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failure or a hardware failure of the tenant infrastructure supporting a service application of the tenant. A fault recovery plan is communicated to the tenant to notify the tenant of the fault occurrence and of the actions taken to restore the tenant infrastructure. It is then determined whether a fault recovery plan response, that is, an acknowledgement of the fault recovery plan from the tenant, has been received. Upon receiving the fault recovery plan response, or at the expiration of a predefined time limit, the fault recovery plan is executed to restore the tenant infrastructure.
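
The flow in the abstract (notify the tenant of the plan, wait for an acknowledgement or a time limit, then execute) can be illustrated with a minimal sketch. This is not the patented implementation: the names FaultRecoveryPlan, coordinate_recovery, notify_tenant, ack_event, execute_plan and time_limit_s are all hypothetical, and the sketch assumes the acknowledgement arrives as an in-process threading.Event rather than over a network.

```python
import threading
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FaultRecoveryPlan:
    """Hypothetical plan record; the field names are assumptions."""
    fault: str            # e.g. "software failure" or "hardware failure"
    actions: List[str]    # steps intended to restore the tenant infrastructure


def coordinate_recovery(
    plan: FaultRecoveryPlan,
    notify_tenant: Callable[[FaultRecoveryPlan], None],
    ack_event: threading.Event,
    execute_plan: Callable[[FaultRecoveryPlan], None],
    time_limit_s: float,
) -> str:
    """Send the plan, wait for an acknowledgement or the time limit, then execute."""
    # Step 1: communicate the fault recovery plan to the tenant.
    notify_tenant(plan)

    # Step 2: block until the tenant acknowledges or the time limit expires.
    # Event.wait returns True if the event was set, False on timeout.
    acknowledged = ack_event.wait(timeout=time_limit_s)

    # Step 3: the plan is executed in either case.
    execute_plan(plan)
    return "acknowledged" if acknowledged else "timed_out"
```

A caller would set ack_event from whatever channel delivers the tenant's response. The return value only records which of the two triggers fired, since the plan runs either way.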