An Analytical Approach to Evaluation of SSD Effects under MapReduce Workloads
Published in: | Journal of semiconductor technology and science 2015, 15(5), 65, pp.511-518 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | As the cost per byte of SSDs decreases dramatically, introducing SSDs into Hadoop becomes an attractive choice for high-performance data processing. In this paper, the cost-per-performance of an SSD-based Hadoop cluster (SSD-Hadoop) and an HDD-based Hadoop cluster (HDD-Hadoop) is evaluated. For this, we propose a MapReduce performance model that uses a queuing network to simulate the execution time of a MapReduce job as the cluster size varies. To achieve an accurate model, the execution time distribution of MapReduce jobs is carefully profiled. The developed model predicts the execution time of MapReduce jobs with less than 7% difference in most cases. Simulations with a varying number of cluster nodes also show that SSD-Hadoop is 20% more cost efficient than HDD-Hadoop, because SSD-Hadoop needs fewer nodes than HDD-Hadoop to achieve comparable performance. KCI Citation Count: 4 |
---|---|
ISSN: | 1598-1657 2233-4866 |
DOI: | 10.5573/JSTS.2015.15.5.511 |
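
Note: the abstract's cost-efficiency argument (fewer SSD nodes for comparable performance) can be illustrated with a minimal sketch. The node counts and per-node prices below are hypothetical placeholders chosen for illustration only; they are not taken from the paper, and the paper's own cost metric may be defined differently.

```python
# Illustrative sketch only: all numbers are hypothetical, not results from the paper.

def cluster_cost(nodes: int, cost_per_node: float) -> float:
    """Total hardware cost of a cluster with identical nodes."""
    return nodes * cost_per_node


def relative_saving(baseline_cost: float, candidate_cost: float) -> float:
    """Fraction of the baseline cost saved by the candidate cluster."""
    return 1.0 - candidate_cost / baseline_cost


# Assumption: both clusters reach a comparable job execution time,
# so comparing total cost is a fair proxy for cost-per-performance.
hdd_cost = cluster_cost(nodes=10, cost_per_node=1000.0)  # hypothetical HDD cluster
ssd_cost = cluster_cost(nodes=6, cost_per_node=1250.0)   # hypothetical SSD cluster

saving = relative_saving(hdd_cost, ssd_cost)
print(f"HDD cluster: {hdd_cost:.0f}, SSD cluster: {ssd_cost:.0f}, saving: {saving:.0%}")
```

In this sketch the SSD nodes cost more individually, yet the smaller node count lowers the total cluster cost, which is the mechanism the abstract describes.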