An Analytical Approach to Evaluation of SSD Effects under MapReduce Workloads


Bibliographic details
Published in: Journal of semiconductor technology and science 2015, 15(5), 65, pp. 511-518
Main authors: Ahn, Sungyong; Park, Sangkyu
Format: Article
Language: English
Online access: Full text
Description
Summary: As the cost per byte of SSDs decreases dramatically, introducing SSDs to Hadoop becomes an attractive choice for high-performance data processing. In this paper, the cost-per-performance of an SSD-based Hadoop cluster (SSD-Hadoop) and an HDD-based Hadoop cluster (HDD-Hadoop) is evaluated. For this, we propose a MapReduce performance model that uses a queuing network to simulate the execution time of a MapReduce job as the cluster size varies. To achieve an accurate model, the execution time distribution of a MapReduce job is carefully profiled. The developed model predicts the execution time of MapReduce jobs with less than 7% difference in most cases. Simulation results with a varying number of cluster nodes also show that SSD-Hadoop is 20% more cost efficient than HDD-Hadoop, because SSD-Hadoop needs fewer nodes than HDD-Hadoop to achieve comparable performance.
KCI Citation Count: 4
ISSN: 1598-1657, 2233-4866
DOI: 10.5573/JSTS.2015.15.5.511