HFSP: Bringing Size-Based Scheduling To Hadoop

Bibliographic Details
Published in:	IEEE Transactions on Cloud Computing, 2017-01, Vol. 5 (1), p. 43-56
Main Authors:	Pastorelli, Mario; Carra, Damiano; Dell'Amico, Matteo; Michiardi, Pietro
Format:	Article
Language:	English
Subjects:	Aging; Batch processing; Cloud computing; Data analysis; Estimation; MapReduce; Processor scheduling; Scheduling; Silicon; Time factors; Training

Description
Summary:	Size-based scheduling with aging has been recognized as an effective approach to guarantee fairness and near-optimal system response times. We present HFSP, a scheduler that brings this technique to Hadoop, a real, complex, multi-server, and widely used system. Size-based scheduling requires a priori job size information, which is not available in Hadoop; HFSP builds this knowledge by estimating job sizes online during job execution. Our experiments, which are based on realistic workloads generated with a standard benchmarking suite, point to a significant decrease in system response times with respect to the widely used Hadoop Fair scheduler, without impacting the fairness of the scheduler, and show that HFSP is largely tolerant to job size estimation errors.
ISSN:	2168-7161, 2372-0018
DOI:	10.1109/TCC.2015.2396056