Distributed big data parallel computing method based on Hadoop MapReduce

Bibliographic details
Main authors: LI PENG, DING GANGYI, HUANG TIANYU, MAO XUKUN
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to a distributed big data parallel computing method based on Hadoop MapReduce. The method comprises the Map, Shuffle, and Reduce steps of the Hadoop framework, with a GPU computing module added between the Hadoop MapReduce framework and the user. The user submits a specific Map function and a specific Reduce function to the GPU computing module. Before the Map step, the GPU computing module uses an interface provided by Hadoop to treat the whole data block distributed to a working node as the value of a single key-value pair. In the Map step, the GPU computing module packages the Map function submitted by the user into a new Map function and submits the new Map function to the Hadoop framework. The new Map function receives the data block from the Hadoop framework, further divides it into key-value pairs, and allocates each key-value pair to a different GPU thread; each GPU thread calls the Map function submitted by the user for parallel computing. A
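
The wrapping step described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: it does not use Hadoop or a GPU, and a CPU thread pool stands in for the GPU threads. The names `split_block`, `wrap_map`, and `word_count_map`, and the choice of one line of text per key-value pair, are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

KV = Tuple[str, int]
UserMap = Callable[[int, str], List[KV]]


def split_block(block: str) -> List[Tuple[int, str]]:
    """Divide one whole data block into (line_number, line) key-value pairs.

    Assumption: one line of text per pair; the source does not say how
    the block is divided.
    """
    return list(enumerate(block.splitlines()))


def wrap_map(user_map: UserMap, workers: int = 4) -> Callable[[str, str], List[KV]]:
    """Package the user's Map function into a new, block-level Map function."""

    def new_map(block_key: str, block: str) -> List[KV]:
        # The framework hands over the whole block as a single value.
        pairs = split_block(block)
        # Assumption: a CPU thread pool stands in for the GPU threads.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Each worker calls the user's Map function on one pair.
            # pool.map yields results in input order.
            results = pool.map(lambda kv: user_map(*kv), pairs)
            return [out for partial in results for out in partial]

    return new_map


def word_count_map(_line_no: int, line: str) -> List[KV]:
    """Example user-submitted Map function: emit (word, 1) for each word."""
    return [(word, 1) for word in line.split()]


new_map = wrap_map(word_count_map)
emitted = new_map("block-0", "a b\nb c")
print(emitted)
```

In this sketch the user-submitted function stays unchanged and only ever sees one key-value pair at a time; the wrapper is the only part that knows about whole data blocks and about how the work is spread across threads.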