Analyzing performance of Apache Tez and MapReduce with hadoop multinode cluster on Amazon cloud

Bibliographic details
Published in:	Journal of big data 2016-10, Vol.3 (1), p.1-10, Article 19
Main authors:	Singh, Rupinder, Kaur, Puneet Jai
Format:	Article
Language:	English
Subjects:	Big Data; Clusters; Communications Engineering; Computational Science and Engineering; Computer Science; Data management; Data Mining and Knowledge Discovery; Database Management; Datasets; Distributed processing; Empirical analysis; Employment; Information Storage and Retrieval; Mathematical Applications in Computer Science; Networks; New technology; Scripts
Online access:	Full text

Description
Summary:	Big Data is the term used for data sets that are so large and complex that traditional devices cannot easily process them, which creates a need for new technology to process these data sets. Apache Hadoop is a good option, and its many components work together to make the Hadoop ecosystem robust and efficient. Apache Pig is a core component of the Hadoop ecosystem and accepts tasks in the form of scripts. To run these scripts, Apache Pig can use either the MapReduce or the Apache Tez framework. In our previous paper we analyzed how these two frameworks differ from each other on the basis of a chosen set of parameters, comparing them both theoretically and empirically on a single-node cluster. In this paper we perform the analysis on a multinode cluster installed on the Amazon cloud.
ISSN:	2196-1115
DOI:	10.1186/s40537-016-0051-6