Reliable Actors with Retry Orchestration
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Cloud developers have to build applications that are resilient to failures
and interruptions. We advocate for a fault-tolerant programming model for the
cloud based on actors, retry orchestration, and tail calls. This model builds
upon persistent data stores and message queues readily available on the cloud.
Retry orchestration guarantees not only that (1) failed actor invocations will
be retried but also that (2) completed invocations are never repeated and (3)
a strict happens-before relationship is preserved across failures within call
stacks. Tail calls can break complex tasks into simple steps to minimize
re-execution during recovery. We review key application patterns and failure
scenarios. We formalize a process calculus to precisely capture the mechanisms
of fault tolerance in this model. We briefly describe our implementation. Using
an application inspired by a typical enterprise scenario, we validate the
functional correctness of our implementation and assess the impact of fault
preparedness and recovery on performance. |
DOI: | 10.48550/arxiv.2111.11562 |
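
The guarantees named in the summary can be illustrated with a small sketch. This is not the authors' implementation, and it does not model actors or message queues. All names here (`Orchestrator`, `TailCall`, `invoke`, the `reserve` and `charge` steps) are hypothetical, and plain in-memory dictionaries stand in for the persistent data store, so it survives simulated step failures but not a real process crash. It shows a failed step being retried, a completed invocation being answered from the record instead of re-executed, and a tail call splitting the task so that recovery resumes at the failed step rather than at the beginning.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class TailCall:
    """Returned by a step to hand the rest of the task to another step."""
    step: str
    arg: Any


@dataclass
class Orchestrator:
    steps: Dict[str, Callable[[Any], Any]]
    # Assumption: these dicts stand in for a persistent data store.
    pending: Dict[str, Tuple[str, Any]] = field(default_factory=dict)
    results: Dict[str, Any] = field(default_factory=dict)

    def invoke(self, request_id: str, step: str, arg: Any, max_attempts: int = 5) -> Any:
        # A recorded completion is returned as-is and never executed again.
        if request_id in self.results:
            return self.results[request_id]
        # Resume from the last recorded step if one exists for this request.
        self.pending.setdefault(request_id, (step, arg))
        attempts = 0
        while True:
            name, value = self.pending[request_id]
            try:
                outcome = self.steps[name](value)
            except Exception:
                attempts += 1
                if attempts >= max_attempts:
                    raise
                continue  # retry only the step that failed
            if isinstance(outcome, TailCall):
                # Record progress: earlier steps are not re-executed on recovery.
                self.pending[request_id] = (outcome.step, outcome.arg)
                attempts = 0
            else:
                self.results[request_id] = outcome
                del self.pending[request_id]
                return outcome


calls = {"reserve": 0, "charge": 0}


def reserve(order):
    calls["reserve"] += 1
    return TailCall("charge", order + ["reserved"])


def charge(order):
    calls["charge"] += 1
    if calls["charge"] == 1:
        raise RuntimeError("simulated failure")  # first attempt fails
    return order + ["charged"]


orch = Orchestrator(steps={"reserve": reserve, "charge": charge})
result = orch.invoke("order-1", "reserve", [])
again = orch.invoke("order-1", "reserve", [])  # duplicate request, not re-run
```

In this run `reserve` executes once and `charge` twice (one failure, one retry); the duplicate request for `order-1` triggers no further execution. A step that fails after producing side effects but before its outcome is recorded would still be re-executed in this sketch, which is why keeping each step small limits the work repeated during recovery.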