GraphMP: I/O-Efficient Big Graph Analytics on a Single Commodity Machine

Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce per...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on big data 2020-12, Vol.6 (4), p.816-829
Hauptverfasser: Sun, Peng, Wen, Yonggang, Duong, Ta Nguyen Binh, Xiao, Xiaokui
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge shards on disk. Third, we use a compressed edge cache mechanism to fully utilize the available memory of a machine to reduce the amount of disk accesses for edges. Extensive evaluations have shown that GraphMP could outperform existing single-machine out-of-core systems such as GraphChi, X-Stream and GridGraph by up to 30, and can be as highly competitive as distributed graph engines like Pregel+, PowerGraph and Chaos.
ISSN:2332-7790
2332-7790
2372-2096
DOI:10.1109/TBDATA.2019.2908384