Insight and reduction of MapReduce stragglers in heterogeneous environment

Bibliographic details
Main authors: Xia Zhao, Kai Kang, YuZhong Sun, Yin Song, Minhao Xu, Tao Pan
Format: Conference proceeding
Language: English
Description
Summary: Speculative and clone execution are existing techniques to overcome the problems of task stragglers and performance degradation in heterogeneous clusters for big data processing. In this paper, we propose an alternative approach to solving these problems, based on profiling results and an analysis of the relations among system parameters. Our approach dynamically adjusts the number of task slots on each node to match the node's processing power, according to the current task progress rate and resource utilization. It contrasts with the existing techniques by attempting to prevent task stragglers from occurring in the first place, by maintaining a balance between resource supply and demand. We have implemented this method in the Hadoop MapReduce platform, and the TPC-H benchmark results show that it achieves a 20-30% performance improvement and 35-88% fewer stragglers than existing techniques.
ISSN: 1552-5244, 2168-9253
DOI: 10.1109/CLUSTER.2013.6702673
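
The summary describes adjusting each node's task slots from its task progress rate and resource utilization. The record does not give the paper's actual rule, so the following is only a minimal sketch of what such a feedback step might look like. The names `NodeStats` and `adjust_slots`, the thresholds, and the one-slot step size are all hypothetical and are not taken from the paper or from Hadoop.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    slots: int            # task slots currently configured on the node
    progress_rate: float  # mean progress per second of tasks on this node
    cpu_util: float       # CPU utilization in [0.0, 1.0]


def adjust_slots(node, cluster_rate, min_slots=1, max_slots=8,
                 slow_ratio=0.8, high_util=0.9, low_util=0.6):
    """Return a new slot count for `node`.

    Hypothetical rule for illustration only; all thresholds are assumptions.
    """
    # Compare this node's progress rate with the cluster-wide mean rate.
    relative = node.progress_rate / cluster_rate if cluster_rate > 0 else 1.0
    if relative < slow_ratio and node.cpu_util >= high_util:
        # Tasks lag and the node is saturated: demand exceeds supply,
        # so remove one slot (never going below min_slots).
        return max(min_slots, node.slots - 1)
    if relative >= 1.0 and node.cpu_util <= low_util:
        # Tasks keep pace and there is headroom: offer one more slot
        # (never going above max_slots).
        return min(max_slots, node.slots + 1)
    # Otherwise supply and demand look balanced; leave the slots unchanged.
    return node.slots
```

Under these assumed thresholds, a saturated node whose tasks run at half the cluster rate would lose one slot, while a lightly loaded node keeping pace would gain one.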