Methods for determining the resources needed to create MapReduce computational models

Bibliographic details
Main authors: Urazmatov, Tokhir; Kuzibaev, Khudaysukur; Otamuratov, Khurmatbek; Gulomov, Azizbek
Format: Conference proceeding
Language: English
Online access: Full text
Description
Summary: The paper describes a mathematical model of the resource cost of executing MapReduce jobs in distributed environments. The total cost of all model resources is shown to consist of the cost of preparing the cluster for operation and launching its tasks, plus the cost of the resources needed to execute those tasks. The main parameters that affect execution speed and the amount of resources consumed when solving problems with the MapReduce paradigm are considered. Four classes of parameters involved in MapReduce optimization are defined: data flow, cost fields, data flow statistics, and cost field statistics. The cost model is found to be linear for the problem of finding word frequencies in a collection of documents, and linearity is preserved for problems that require the sequential application of several MapReduce computation models.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0190710
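
Illustrative note: the summary states that the total cost splits into a fixed part (cluster preparation and task launch) and a part for executing the tasks, and that linearity survives when several MapReduce jobs are applied in sequence. This record does not reproduce the paper's formulas, so the following Python sketch is only an assumption-laden illustration, not the authors' model. The class `LinearJobCost`, the function `pipeline_cost`, the two example jobs, and every numeric coefficient are hypothetical. The sketch additionally assumes that each job's output size is proportional to its input size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearJobCost:
    """Hypothetical affine cost model for a single MapReduce job."""
    setup_cost: float       # fixed cost: cluster preparation and task launch
    cost_per_record: float  # execution cost per input record (assumed constant)
    output_ratio: float     # output records per input record (assumed constant)

    def cost(self, n_records: float) -> float:
        # Fixed part plus a part proportional to the input size.
        return self.setup_cost + self.cost_per_record * n_records

    def output_size(self, n_records: float) -> float:
        return self.output_ratio * n_records


def pipeline_cost(jobs, n_records):
    """Total cost of running the jobs in sequence, each fed by the previous output."""
    total = 0.0
    size = n_records
    for job in jobs:
        total += job.cost(size)
        size = job.output_size(size)
    return total


# Made-up coefficients: a word-frequency job followed by a second job
# that consumes its output (for example, ordering words by frequency).
word_count = LinearJobCost(setup_cost=10.0, cost_per_record=0.5, output_ratio=0.25)
second_job = LinearJobCost(setup_cost=10.0, cost_per_record=0.25, output_ratio=1.0)
jobs = [word_count, second_job]

# Equal increases in input size produce equal increases in total cost,
# which is what linearity of the combined model means here.
step_1 = pipeline_cost(jobs, 2000) - pipeline_cost(jobs, 1000)
step_2 = pipeline_cost(jobs, 3000) - pipeline_cost(jobs, 2000)
print(step_1, step_2)
```

Under these assumptions the combined cost is again a fixed part (the sum of the setup costs) plus a part proportional to the size of the original input, because each affine cost is evaluated on an input that is itself proportional to the original size.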