DrTM+B: Replication-Driven Live Reconfiguration for Fast and General Distributed Transaction Processing


Bibliographic details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2022-10, Vol. 33 (10), pp. 2628-2643
Main authors: Shen, Sijie; Wei, Xingda; Chen, Rong; Chen, Haibo; Zang, Binyu
Format: Article
Language: English
Description
Abstract: Recent in-memory database systems leverage advanced hardware features like RDMA to provide transaction processing at millions of transactions per second. Distributed transaction processing systems can scale to even higher rates, especially for partitionable workloads. Unfortunately, it is challenging to sustain such high rates during live reconfiguration of partitions. In this article, we observe that state-of-the-art approaches would cause notable performance disruption under fast transaction processing. To this end, this article presents DrTM+B, a live reconfiguration approach that seamlessly repartitions data with little performance disruption to running transactions. DrTM+B uses a pre-copy-based mechanism to avoid excessive data transfer by leveraging common properties in recent transactional systems. DrTM+B's reconfiguration plans reduce data movement by preferring existing data replicas, while copying data from multiple replicas asynchronously and in parallel. It further reuses the log forwarding mechanism in primary-backup replication to seamlessly track and forward dirty database tuples, avoiding iterative copying costs. To commit a reconfiguration plan in a transactionally safe way, DrTM+B designs a cooperative commit protocol for synchronization of data and state among replicas. To boost performance during data migration, DrTM+B combines the pre-copy and post-copy schemes into a hybrid copy scheme. The live reconfiguration approach can also coexist with the fault-tolerance mechanisms of primary-backup replication to provide high availability. Evaluation on a working system based on DrTM+R with 3-way replication, using typical OLTP workloads such as TPC-C and SmallBank, shows that DrTM+B incurs only very small performance degradation during live reconfiguration and provides high availability. Both the reconfiguration time and the downtime are also minimal.
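The abstract's idea of reconfiguration plans that prefer existing data replicas can be illustrated with a small sketch. This is not DrTM+B's actual planner: the names `Move` and `plan_moves`, the shard and node identifiers, and the shard-count load model are all hypothetical and chosen only for illustration. The sketch assumes that moving a shard to a node that already holds a backup replica requires no bulk data copy, while any other destination does.

```python
from dataclasses import dataclass


@dataclass
class Move:
    shard: str
    src: str
    dst: str
    needs_copy: bool  # False when dst already holds a backup replica (assumption)


def plan_moves(shards_to_move, primary, backups, load):
    """Pick a new primary node for each shard, preferring existing replicas.

    primary: shard -> current primary node
    backups: shard -> list of nodes holding a backup replica
    load:    node  -> number of primary shards hosted (hypothetical load model)
    """
    load = dict(load)  # work on a copy so the caller's view is unchanged
    moves = []
    for shard in shards_to_move:
        src = primary[shard]
        candidates = [n for n in backups.get(shard, []) if n != src]
        if candidates:
            # A backup already has the data: promote the least-loaded one.
            dst = min(candidates, key=lambda n: load[n])
            needs_copy = False
        else:
            # No replica available: fall back to the least-loaded other node.
            dst = min((n for n in load if n != src), key=lambda n: load[n])
            needs_copy = True
        load[dst] += 1
        load[src] -= 1
        moves.append(Move(shard, src, dst, needs_copy))
    return moves
```

In this toy model, a plan with fewer `needs_copy=True` moves corresponds to less data movement, which is the property the abstract attributes to DrTM+B's plans.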
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2022.3148251