SMART REDUCE TASK SCHEDULER
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | A system and a method for scheduling a reduce task on nodes are disclosed. The intermediate data items on each node in a cluster of nodes are bucketized. A counter is created that provides a count of the intermediate data items placed into each of the buckets for the node. This counter value is provided to a scheduler. From the counter information, the scheduler can determine the cost of moving the intermediate data for a bucket to each of the different nodes. Once that cost is determined, the scheduler can determine which of the nodes should perform the reduce task for that particular bucket. The scheduler minimizes the amount of intermediate data shuffled between the nodes by choosing the lowest-cost shuffle option for each of the buckets. |
---|
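
The scheduling idea in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`count_buckets`, `shuffle_cost`, `assign_reduce_tasks`), the CRC32-based bucketing, and the cost model (cost equals the number of intermediate items that would have to leave their current node) are all assumptions made for illustration. Load balancing and network topology are ignored.

```python
import zlib
from collections import Counter


def count_buckets(keys, num_buckets):
    """Per-node counter: how many intermediate items fall into each bucket.

    Assumption: buckets are chosen by a stable hash (CRC32) of the key.
    """
    counter = Counter()
    for key in keys:
        counter[zlib.crc32(key.encode("utf-8")) % num_buckets] += 1
    return dict(counter)


def shuffle_cost(counts, bucket, target):
    """Items of `bucket` that must move if `target` runs the reduce task.

    Assumed cost model: one unit per item stored on any other node.
    """
    return sum(
        per_bucket.get(bucket, 0)
        for node, per_bucket in counts.items()
        if node != target
    )


def assign_reduce_tasks(counts):
    """Pick, for every bucket, the node with the lowest shuffle cost.

    `counts` maps node name -> {bucket id -> item count}, i.e. the counter
    values each node reports to the scheduler. Ties go to the node whose
    name sorts first.
    """
    nodes = sorted(counts)
    buckets = sorted({b for per_bucket in counts.values() for b in per_bucket})
    return {
        bucket: min(nodes, key=lambda n: shuffle_cost(counts, bucket, n))
        for bucket in buckets
    }


# Hypothetical counter values reported by three nodes for two buckets.
counts = {
    "node-a": {0: 90, 1: 5},
    "node-b": {0: 10, 1: 70},
    "node-c": {0: 0, 1: 25},
}
print(assign_reduce_tasks(counts))  # {0: 'node-a', 1: 'node-b'}
```

In this example, bucket 0 goes to `node-a` because only 10 items would have to move there, against 90 or 100 for the other nodes. Bucket 1 goes to `node-b`, where 30 items would move.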