Joint scheduling of processing and Shuffle phases in MapReduce systems



Bibliographic Details
Main authors: Fangfei Chen, Kodialam, M., Lakshman, T. V.
Format: Conference proceeding
Language: English
Description
Summary: MapReduce has emerged as an important paradigm for processing data in large data centers. MapReduce is a three-phase algorithm comprising Map, Shuffle, and Reduce phases. Due to its widespread deployment, there have been several recent papers outlining practical schemes to improve the performance of MapReduce systems. All these efforts focus on one of the three phases to obtain performance improvement. In this paper, we consider the problem of jointly scheduling all three phases of the MapReduce process with a view to understanding the theoretical complexity of the joint scheduling and working towards practical heuristics for scheduling the tasks. We give guaranteed approximation algorithms and outline several heuristics to solve the joint scheduling problem.
ISSN: 0743-166X, 2641-9874
DOI: 10.1109/INFCOM.2012.6195473