RAFTing MapReduce: Fast recovery on the RAFT

Bibliographic details
Main authors:	Quiane-Ruiz, Jorge-Arnulfo; Pinkel, C; Schad, J; Dittrich, J
Format:	Conference proceeding
Language:	English
Subjects:	Checkpointing; Delay; Fault tolerance; Fault tolerant systems; File systems; Resumes; Runtime

Description
Summary:	MapReduce is a computing paradigm that has gained a lot of popularity because it allows non-expert users to easily run complex analytical tasks at very large scale. At such scale, task and node failures are no longer an exception but a characteristic of the system, which makes fault tolerance critical to the efficient operation of any application. MapReduce automatically reschedules failed tasks to available nodes, which then recompute those tasks from scratch. However, this policy can significantly decrease application performance. In this paper, we propose a family of Recovery Algorithms for Fast-Tracking (RAFT) MapReduce. Because ease of use is a major feature of MapReduce, RAFT focuses on simplicity and non-intrusiveness in order to be implementation-independent. To recover efficiently from task failures, RAFT exploits the fact that MapReduce produces and persists intermediate results at several points in time, and it piggy-backs checkpoints on the task progress computation. To deal with multiple node failures, we propose query metadata checkpointing: we keep track of the mapping between input key-value pairs and intermediate data for all reduce tasks. As a result, RAFT does not need to re-execute completed map tasks entirely. Instead, RAFT recomputes only the intermediate data that were processed for local reduce tasks and hence not shipped to another node for processing. We also introduce a scheduling strategy that takes full advantage of these recovery algorithms. We implemented RAFT on top of Hadoop and evaluated it on a 45-node cluster using three common analytical tasks. Overall, our experimental results demonstrate that RAFT outperforms Hadoop runtimes by 23% on average under task and node failures. The results also show that RAFT has negligible runtime overhead.
ISSN:	1063-6382, 2375-026X
DOI:	10.1109/ICDE.2011.5767877
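
Illustrative sketch (not taken from the paper): the summary says that recovery from task failures relies on intermediate results being persisted at several points in time, with checkpoints recorded alongside them. The Python toy below shows one possible reading of that idea for a single map task. Every name in it (Checkpoint, run_map_task, run_with_recovery, spill_every, fail_at) is hypothetical, spills are held in memory rather than on disk, and nothing here reflects Hadoop's or RAFT's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: input position covered by persisted spills."""
    next_offset: int = 0                        # first input record not yet in a spill
    spills: list = field(default_factory=list)  # stand-in for persisted intermediate results


def run_map_task(records, map_fn, checkpoint, spill_every=3, fail_at=None):
    """Process records starting at checkpoint.next_offset.

    Raises RuntimeError when reaching input offset `fail_at`,
    which simulates a task failure.
    """
    buffer = []
    for offset in range(checkpoint.next_offset, len(records)):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError(f"simulated failure at offset {offset}")
        buffer.append(map_fn(records[offset]))
        if len(buffer) == spill_every:
            # Persist the spill and update the checkpoint in the same step.
            checkpoint.spills.append(buffer)
            checkpoint.next_offset = offset + 1
            buffer = []
    if buffer:
        # Flush whatever is left at the end of the input.
        checkpoint.spills.append(buffer)
        checkpoint.next_offset = len(records)


def run_with_recovery(records, map_fn, fail_at, spill_every=3):
    """Run once with an injected failure, then resume from the checkpoint.

    Returns (flattened output, number of map_fn calls over both attempts).
    """
    calls = 0

    def counting_map(record):
        nonlocal calls
        calls += 1
        return map_fn(record)

    checkpoint = Checkpoint()
    try:
        run_map_task(records, counting_map, checkpoint, spill_every, fail_at)
    except RuntimeError:
        # Resume after the last spill instead of restarting from offset 0.
        run_map_task(records, counting_map, checkpoint, spill_every)
    output = [value for spill in checkpoint.spills for value in spill]
    return output, calls
```

In this toy, a run over ten records with spill_every=3 and a failure injected at offset 7 resumes at offset 6 and makes 11 map calls in total. Restarting from scratch would make 17. Only the unspilled buffer contents are lost and recomputed.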