Towards highly-concurrent leaderless state machine replication for distributed systems

Bibliographic details
Published in: Journal of systems architecture 2022-06, Vol.127, p.102516, Article 102516
Main authors: Wang, Weilue; Tan, Yujuan; Wu, Changze; Liu, Duo; Wu, Yu; Luo, Longpan; Chen, Xianzhang
Format: Article
Language: English
Description
Summary: State Machine Replication (SMR) is a technique for implementing fault-tolerant services and is used by many modern Internet services. Classic SMR uses a single leader to order all state machine commands. Because the single-leader approach limits scalability and availability, recent protocols propose a leaderless technique in which each replica can make progress using a quorum of replicas. While the leaderless strategy is gaining traction, it requires all replicas to serialize a directed graph according to a precise specification, which usually results in sequential execution. On today's common multicore servers, sequential execution also limits performance. We propose CCDG (Concurrent Construct Dependency Graphs), an efficient scheduler for leaderless State Machine Replication that increases parallelism to fully exploit multicore capabilities and boost performance. To reach higher levels of parallelism, CCDG constructs dependency graphs concurrently while guaranteeing linearizability. CCDG also improves tail latency by eliminating unnecessary dependencies, which reduces scheduling wait times. Our extensive experimental study shows that, in workloads with conflicting commands, CCDG achieves up to 3.3 times the throughput of EPaxos, one of the most popular leaderless SMR consensus protocols.
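
Note: the general idea of scheduling replicated commands through a dependency graph can be illustrated with a minimal sketch. This is not the CCDG algorithm or EPaxos itself. It assumes a single, already agreed command order (so the graph is acyclic), one key per command, and a simple read/write conflict rule. The names `Command`, `build_dependency_graph`, and `execution_waves` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Command(NamedTuple):
    cid: str        # hypothetical command identifier
    key: str        # the single key this command touches (assumption)
    is_write: bool  # two reads of the same key do not conflict


def conflicts(a: Command, b: Command) -> bool:
    """Two commands conflict if they touch the same key and one writes."""
    return a.key == b.key and (a.is_write or b.is_write)


def build_dependency_graph(log: list[Command]) -> dict[str, set[str]]:
    """Map each command id to the ids of earlier conflicting commands."""
    deps: dict[str, set[str]] = {c.cid: set() for c in log}
    for i, later in enumerate(log):
        for earlier in log[:i]:
            if conflicts(earlier, later):
                deps[later.cid].add(earlier.cid)
    return deps


def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group commands into waves; commands in one wave have no
    unfinished dependencies and could run on separate cores."""
    remaining = {cid: set(d) for cid, d in deps.items()}
    waves: list[list[str]] = []
    while remaining:
        ready = sorted(cid for cid, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle; this sketch assumes none")
        waves.append(ready)
        for cid in ready:
            del remaining[cid]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


log = [
    Command("a", "x", True),
    Command("b", "y", True),
    Command("c", "x", False),
    Command("d", "x", False),
    Command("e", "y", True),
]
waves = execution_waves(build_dependency_graph(log))
print(waves)  # [['a', 'b'], ['c', 'd', 'e']]
```

In this sketch, a fully sequential executor would take five steps, while the wave schedule takes two, because only commands with a real conflict wait for each other. Dropping edges that are not needed (here, the two reads `c` and `d` do not depend on each other) is what shortens the wait.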
ISSN: 1383-7621, 1873-6165
DOI: 10.1016/j.sysarc.2022.102516