Data-Aware Approximate Workflow Scheduling

Optimization of data placement in complex scientific workflows has become very crucial since the large amounts of data generated by these workflows significantly increases the turnaround time of the end-to-end application. It is almost impossible to make an optimal scheduling for the end-to-end work...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2018-05
Hauptverfasser: Yin, Dengpan, Kosar, Tevfik
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Optimization of data placement in complex scientific workflows has become very crucial since the large amounts of data generated by these workflows significantly increases the turnaround time of the end-to-end application. It is almost impossible to make an optimal scheduling for the end-to-end workflow without considering the intermediate data movement. In order to reduce the complexity of the workflow-scheduling problem, most of the existing work constrains the problem space by some unrealistic assumptions, which result in non-optimal scheduling in practice. In this study, we propose a genetic data-aware algorithm for the end-to-end workflow scheduling problem. Distinct from the past research, we develop a novel data-aware evaluation function for each chromosome, a common augmenting crossover operator and a simple but effective mutation operator. Our experiments on different workflow structures show that the proposed GA based approach gives a scheduling close to the optimal one.
ISSN:2331-8422