Minimizing total job completion time in MapReduce scheduling

•Study MapReduce scheduling as a generalization of two stage flow-shop scheduling.•prove a new lower bound on the total job completion time.•present an improved (9–1/m_1)-approximation algorithm.•design numerical experiments to demonstrate effectiveness of the new lower bound.•demonstrate the practi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Computers & industrial engineering 2021-08, Vol.158, p.107387, Article 107387
Hauptverfasser: Dong, Jianming, Goebel, Randy, Hu, Jueliang, Lin, Guohui, Su, Bing
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:•Study MapReduce scheduling as a generalization of two stage flow-shop scheduling.•prove a new lower bound on the total job completion time.•present an improved (9–1/m_1)-approximation algorithm.•design numerical experiments to demonstrate effectiveness of the new lower bound.•demonstrate the practical outperformance of the approximation algorithm. We follow up an earlier studied multiple-task parallel-machine scheduling model that captures the core challenges in MapReduce scheduling, with the optimization goal to minimize the total job completion time. The problem is a novel generalization of the classic two-stage flow-shop scheduling, in which we are given a set of jobs each is associated with multiple map tasks and multiple reduce tasks. All these tasks are non-preemptive to be processed on multiple parallel identical map machines and multiple parallel identical reduce machines, respectively, under the strict precedence constraints that, for each job, any reduce task cannot start before all its map tasks have been finished. We prove a new lower bound on the total job completion time, and based on which we present an O(nlogn+N+m1+m2)-time (9-3m1)-approximation algorithm, where n and N are the number of jobs and the total number of tasks, respectively, m1 and m2 are the numbers of map and reduce machines, respectively, and they can even be part of the input. Our algorithm improves the previous best 12-approximation, and it reduces to the best approximation algorithms for several interesting or well studied special cases. We confirm through numerical experiments on 828,100 instances that both our lower bound and our algorithm significantly outperform: the empirical mean improvement ratio of the new lower bound is as high as 37.75%, and the empirical mean approximation ratio of our algorithm is only 1.7582.
ISSN:0360-8352
1879-0550
DOI:10.1016/j.cie.2021.107387