Distributed big data processing method

Bibliographic details
Main authors: ZHANG QUANYOU, KOU QIONGJIE, WU JUNHONG, QIAN HEPING, TAO ZHANGANG
Format: Patent
Language: Chinese; English
Description
Abstract: The invention provides a distributed big data processing method and relates to the technical field of data processing. The nodes of a hypercube data model are divided into two sub-hypercubes, and the data in each sub-hypercube is then processed separately. As the scale n varies, the time complexity of the hypercube-model distributed algorithm is markedly lower than that of the timestamp distributed algorithm and the DFS (depth-first search) minimum-spanning-tree distributed algorithm. When n > k, the hypercube-model distributed algorithm is markedly more efficient than both the timestamp distributed algorithm and the DFS minimum-spanning-tree distributed algorithm.
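
The abstract does not say how the hypercube is split or how the two halves are processed. As a rough illustration only, the sketch below assumes the common convention that the nodes of a d-dimensional hypercube are labelled 0 to 2^d - 1 and that the split is made on the highest address bit. The names split_hypercube, neighbors, and reduce_hypercube are hypothetical and do not come from the patent.

```python
# Illustrative sketch only: the labelling scheme and the split on the top
# address bit are assumptions, not details taken from the patent.

def split_hypercube(dim):
    """Split the 2**dim node labels of a hypercube into two sub-hypercubes.

    Nodes whose highest address bit is 0 form one (dim-1)-dimensional
    sub-hypercube; nodes whose highest bit is 1 form the other.
    """
    if dim < 1:
        raise ValueError("dim must be at least 1")
    top = 1 << (dim - 1)
    lower = [v for v in range(1 << dim) if not v & top]
    upper = [v for v in range(1 << dim) if v & top]
    return lower, upper


def neighbors(node, dim):
    """Return the dim neighbours of a node: labels differing in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def reduce_hypercube(values, combine):
    """Combine one value per node by recursively halving the hypercube.

    values[i] is the datum held by node i, so the first half of the list is
    the lower sub-hypercube and the second half is the upper one.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("expected 2**d values, one per node")
    if n == 1:
        return values[0]
    half = n // 2
    left = reduce_hypercube(values[:half], combine)
    right = reduce_hypercube(values[half:], combine)
    return combine(left, right)
```

Under these assumptions, a 3-dimensional hypercube splits into nodes 0 to 3 and nodes 4 to 7, and each half is itself a 2-dimensional hypercube that can be split again. If the two halves were handled in parallel, the recursion would need d combine rounds for 2^d nodes; that is a property of this sketch and not a claim about the complexity figures in the abstract.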