Quasi-Streaming Graph Partitioning: A Game Theoretical Approach

Bibliographic details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2019-07, Vol. 30 (7), p. 1643-1656
Main authors: Hua, Qiang-Sheng; Li, Yangyang; Yu, Dongxiao; Jin, Hai
Format: Article
Language: English
Description
Summary: Graph partitioning is a fundamental problem in enabling scalable computation on large graphs. Existing partitioning models are either streaming-based or offline. In the streaming model, the decision for the current edge depends on the partition choices of all previous edges, so it is hard to carry out partitioning in parallel. Offline partitioning, in turn, requires full knowledge of the input graph, which may not be practical for large graphs. In this work, we propose a quasi-streaming partitioning model and a game-theory-based solution for the edge partitioning problem. Specifically, we split the whole edge stream into a series of batches, where the batch size is a constant multiple of the number of partitions. Within each batch, we model edge partitioning as a game in which each edge is a player and its partition choice is that player's rational strategy. The edge partitioning problem is thereby decomposed into finding Nash equilibria in a series of games. We mathematically prove that a Nash equilibrium exists in such a game and analyze the number of rounds needed to converge to one. We further measure the quality of these Nash equilibria by computing the PoA (Price of Anarchy), which is bounded by the number of partitions. We then evaluate the performance of our strategy through comprehensive experiments on both real-world graphs and random graphs. Results show that our solution achieves significant improvements in load balance and replication factor compared with five existing streaming partitioning strategies.
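Illustration (not part of the catalog record or the paper): the batch-and-game scheme in the summary can be sketched roughly as follows. This is a minimal sketch under stated assumptions. The cost function (partition load weighted by `alpha`, plus the number of endpoints that would need a new replica), the names `partition_stream` and `replication_factor`, and the parameters `c`, `alpha` and `max_rounds` are all hypothetical; the summary does not give the paper's actual cost function. The round cap is there because convergence is not claimed for this illustrative cost.

```python
from collections import defaultdict
from itertools import islice


def partition_stream(edges, k, c=2, alpha=1.0, max_rounds=50):
    """Assign each edge to one of k partitions, batch by batch.

    Each batch holds c * k edges (a constant multiple of k). Inside a batch,
    every edge is a player that repeatedly moves to its cheapest partition
    until no player wants to move (a pure Nash equilibrium for this
    hypothetical cost) or max_rounds is reached.
    """
    loads = [0] * k                                  # edges per partition
    incident = [defaultdict(int) for _ in range(k)]  # incident[p][v]: edges on p touching v
    assignment = []                                  # assignment[i]: partition of the i-th edge

    def place(edge, p, sign):
        # Add (sign=+1) or remove (sign=-1) an edge's contribution to partition p.
        loads[p] += sign
        for v in edge:
            incident[p][v] += sign

    def cost(edge, p):
        # Assumed cost: weighted load plus endpoints not yet present on p.
        missing = sum(1 for v in edge if incident[p][v] == 0)
        return alpha * loads[p] + missing

    stream = iter(edges)
    while True:
        batch = list(islice(stream, c * k))
        if not batch:
            break
        base = len(assignment)
        # Initial strategies: greedy placement in arrival order.
        for edge in batch:
            p = min(range(k), key=lambda q: cost(edge, q))
            place(edge, p, +1)
            assignment.append(p)
        # Best-response rounds within the batch.
        for _ in range(max_rounds):
            moved = False
            for i, edge in enumerate(batch, start=base):
                current = assignment[i]
                place(edge, current, -1)
                best = min(range(k), key=lambda q: cost(edge, q))
                if cost(edge, best) >= cost(edge, current):
                    best = current                   # move only on strict improvement
                place(edge, best, +1)
                if best != current:
                    assignment[i] = best
                    moved = True
            if not moved:
                break
    return assignment


def replication_factor(edges, assignment):
    """Average number of partitions in which each vertex appears."""
    parts = defaultdict(set)
    for (u, v), p in zip(edges, assignment):
        parts[u].add(p)
        parts[v].add(p)
    return sum(len(s) for s in parts.values()) / len(parts)
```

For example, `partition_stream(edge_list, k=4)` returns one partition index per edge, and `replication_factor(edge_list, assignment)` reports the replication factor mentioned in the summary. Edges placed in earlier batches stay fixed, so only the current batch is renegotiated.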
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2018.2890515