Estimating graph distance and centrality on shared nothing architectures

Summary We present a parallel toolkit for pairwise distance computation in massive networks. Computing the exact shortest paths between a large number of vertices is a costly operation, and serial algorithms are not practical for billion‐scale graphs. We first describe an efficient parallel method t...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Concurrency and computation 2015-09, Vol.27 (14), p.3587-3613
Hauptverfasser: Balkir, Atilla Soner, Oktay, Huseyin, Foster, Ian
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Summary We present a parallel toolkit for pairwise distance computation in massive networks. Computing the exact shortest paths between a large number of vertices is a costly operation, and serial algorithms are not practical for billion‐scale graphs. We first describe an efficient parallel method to solve the single source shortest path problem on commodity hardware with no shared memory. Using it as a building block, we introduce a new parallel algorithm to estimate the shortest paths between arbitrary pairs of vertices. Our method exploits data locality, produces highly accurate results, and allows batch computation of shortest paths with 7% average error in graphs that contain billions of edges. The proposed algorithm is up to two orders of magnitude faster than previously suggested algorithms and does not require large amounts of memory or expensive high‐end servers. We further leverage this method to estimate the closeness and betweenness centrality metrics, which involve systems challenges dealing with indexing, joining, and comparing large datasets efficiently. In one experiment, we mined a real‐world Web graph with 700 million nodes and 12 billion edges to identify the most central vertices and calculated more than 63 billion shortest paths in 6 h on a 20‐node commodity cluster. Copyright © 2014 John Wiley & Sons, Ltd.
ISSN:1532-0626
1532-0634
DOI:10.1002/cpe.3354