Scheduling Resources to Multiple Pipelines of One Query in a Main Memory Database Cluster

To fully utilize the resources of a main memory database cluster, we additionally take the independent parallelism into account to parallelize multiple pipelines of one query. However, scheduling resources to multiple pipelines is an intractable problem. Traditional static approaches to this problem...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on knowledge and data engineering 2020-03, Vol.32 (3), p.533-546
Hauptverfasser: Fang, Zhuhe, Weng, Chuliang, Wang, Li, Hu, Huiqi, Zhou, Aoying
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:To fully utilize the resources of a main memory database cluster, we additionally take the independent parallelism into account to parallelize multiple pipelines of one query. However, scheduling resources to multiple pipelines is an intractable problem. Traditional static approaches to this problem may lead to a serious waste of resources and suboptimal execution order of pipelines, because it is hard to predict the actual data distribution and fluctuating workloads at compile time. In response, we propose a dynamic scheduling algorithm, List with Filling and Preemption (LFPS), based on two novel techniques. (1) Adaptive filling improves resource utilization by issuing more extra pipelines to adaptively fill idle resource "holes" during execution. (2) Rank-based preemption strictly guarantees scheduling the pipelines on the critical path first at run time. Interestingly, the latter facilitates the former filling idle "holes" with best efforts to finish multiple pipelines as soon as possible. We implement LFPS in our prototype database system. Under the workloads of TPC-H, experiments show our work improves the finish time of parallelizable pipelines from one query up to 2.5X than a static approach and 2.1X than a serialized execution.
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2018.2884724