Early straggler tasks detection by recurrent neural network in a heterogeneous environment
Published in: | Applied Intelligence (Dordrecht, Netherlands), 2023-04, Vol. 53 (7), p. 7369-7389 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Heterogeneity is common in parallel and distributed environments used for extensive computations such as MapReduce. Stragglers are tasks running on poorly performing nodes in the cluster, and detecting them early is challenging in such environments. In previously proposed approaches, late detection of straggler tasks and estimation of the time to end (TTE) for all tasks running in a heterogeneous environment delay the entire job execution. Early straggler detection helps to speculate a task in the early stages of its execution, which indirectly improves the execution of the complete job. This article proposes early straggler detection by a recurrent neural network (ESDRNN). An RNN is a type of artificial neural network that is widely used for processing sequential time-series data. ESDRNN collects task and node information from the ApplicationMaster every three seconds to train the RNN. The RNN classifies straggler tasks early, between thirty and forty seconds into task execution, and transfers the list of classified tasks to an agent running on the ResourceManager. The agent then predicts the TTE of these classified tasks with an autoregressive integrated moving average (ARIMA) model. Finally, it sorts the list, prioritizing tasks with higher TTE, refreshes it every ten seconds, and speculates the tasks so that the MapReduce job completes earlier. The technique is evaluated on HiBench, one of the most popular benchmark suites for Hadoop. Compared with the default speculation technique and other techniques, the proposed speculation technique detects stragglers early, within 35 to 40 seconds of task execution. As a result, it reduces job execution time by 21% to 38% on average for different workloads in a heterogeneous Hadoop cluster. |
---|---|
ISSN: | 0924-669X, 1573-7497 |
DOI: | 10.1007/s10489-022-03837-1 |
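
The final step described in the abstract (estimate the TTE of the tasks already flagged as stragglers, order them by TTE, and speculate the slowest ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the RNN or ARIMA details, so the sketch replaces the ARIMA forecast with a plain progress-rate extrapolation, and the names `TaskSamples`, `estimate_tte`, `rank_for_speculation`, and the `slots` parameter are hypothetical. Only the three-second sampling interval is taken from the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

SAMPLE_INTERVAL_S = 3.0  # sampling period stated in the abstract


@dataclass
class TaskSamples:
    """Hypothetical container: progress fractions in [0, 1], one per sample."""
    task_id: str
    progress: List[float]


def estimate_tte(samples: TaskSamples) -> float:
    """Estimate time to end (seconds) by extrapolating the mean progress rate.

    Assumption: this simple extrapolation stands in for the ARIMA forecast
    named in the abstract; it is not the paper's model.
    """
    if len(samples.progress) < 2:
        return float("inf")  # not enough history to estimate a rate
    elapsed = (len(samples.progress) - 1) * SAMPLE_INTERVAL_S
    rate = (samples.progress[-1] - samples.progress[0]) / elapsed
    if rate <= 0:
        return float("inf")  # no observable progress
    return (1.0 - samples.progress[-1]) / rate


def rank_for_speculation(flagged: List[TaskSamples],
                         slots: int) -> List[Tuple[str, float]]:
    """Return up to `slots` flagged tasks, highest estimated TTE first."""
    ranked = sorted(((t.task_id, estimate_tte(t)) for t in flagged),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:slots]


if __name__ == "__main__":
    flagged_tasks = [
        TaskSamples("task_a", [0.0, 0.1, 0.2]),
        TaskSamples("task_b", [0.0, 0.01, 0.02]),
        TaskSamples("task_c", [0.0, 0.3, 0.6]),
    ]
    for task_id, tte in rank_for_speculation(flagged_tasks, slots=2):
        print(f"{task_id}: estimated TTE {tte:.1f} s")
```

In a setup like the one the abstract describes, the ranking would be recomputed each time the list is refreshed (every ten seconds), with the flagged tasks supplied by the classifier rather than hard-coded.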