Cyclic Workflow Execution Mechanism on Top of MapReduce Framework
Main authors: | , , |
---|---|
Format: | Conference proceeding |
Language: | English |
Summary: | The MapReduce programming model has been used in many kinds of data-intensive processing and analysis projects because of its ease of use and good scalability. In this paper, we discuss the execution mechanism of cyclic workflows on top of the MapReduce framework. A novel cycle elimination algorithm is proposed to decompose a cyclic workflow into DAG (Directed Acyclic Graph) sub-workflows. In each iteration, it dynamically and recursively searches for the maximum DAG sub-workflow according to the current decision result of the decision node. A DAG sub-workflow scheduling strategy, which comprises a DAG grouping mechanism and MapReduce task mapping, is also presented. Finally, we propose an intermediate data transmission mechanism named Partition Pushing, which can increase the parallelism between the executions of dependent jobs. Experiments show that the proposed workflow execution mechanism schedules cyclic workflows efficiently by improving the parallelism between dependent jobs, and consequently reduces the workflow makespan by 20%-60%. |
DOI: | 10.1109/SKG.2011.46 |