FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework
Dynamic graph random walk (DGRW) emerges as a practical tool for capturing structural relations within a graph. Effectively executing DGRW on GPU presents certain challenges. First, existing sampling methods demand a pre-processing buffer, causing substantial space complexity. Moreover, the power-la...
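Main authors: | Mei, Junyi; Sun, Shixuan; Li, Chao; Xu, Cheng; Chen, Cheng; Liu, Yibo; Wang, Jing; Zhao, Cheng; Hou, Xiaofeng; Guo, Minyi; He, Bingsheng; Cong, Xiaoliang |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |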
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang |
description | Dynamic graph random walk (DGRW) has emerged as a practical tool for capturing
structural relations within a graph. Executing DGRW efficiently on GPUs presents
certain challenges. First, existing sampling methods demand a pre-processing
buffer, which incurs substantial space overhead. Moreover, the power-law
distribution of graph vertex degrees introduces workload imbalance,
making DGRW difficult to parallelize. In this paper, we propose
FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker
implements an efficient parallel sampling method that fully exploits GPU
parallelism and reduces space complexity. It also employs a sampler-centric
paradigm alongside a dynamic scheduling strategy to handle large numbers of
walk queries. FlowWalker is a memory-efficient framework that
requires no auxiliary data structures in GPU global memory. We evaluate the
performance of FlowWalker extensively on ten datasets, and experimental results
show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared
with existing CPU, GPU, and FPGA random walk frameworks, respectively. A case
study shows that FlowWalker reduces random walk time from 35% to 3% of the
pipeline time in ByteDance friend recommendation GNN training. |
doi_str_mv | 10.48550/arxiv.2404.08364 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2404_08364</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2404_08364</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-126790257a38bc95530d8cd04aa94eb2ab8b4935f7fd6ebbe92d7540fae654643</originalsourceid><addsrcrecordid>eNotz71OwzAYhWEvDKhwAUz4Bhxc_yVhqwpJkYpAqIgx-hx_plbjJHIRJXdPW5jOcvRKDyE3c56pQmt-B-knfGdCcZXxQhp1SaDqhsMHdDtM93RBnzEOaWLofWgD9l8UekdX4XPLRkx-SBH6Fmn9-s4s7NHRh6mHGFpaJxi39O34HiI95WiVIOJhSLsrcuGh2-P1_87IpnrcLFds_VI_LRdrBiZXbC5MXnKhc5CFbUutJXdF67gCKBVaAbawqpTa594ZtBZL4XKtuAc0WhklZ-T2L3s2NmMKEdLUnKzN2Sp_AXXgTt0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><source>arXiv.org</source><creator>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</creator><creatorcontrib>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</creatorcontrib><description>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing
structural relations within a graph. Effectively executing DGRW on GPU presents
certain challenges. First, existing sampling methods demand a pre-processing
buffer, causing substantial space complexity. Moreover, the power-law
distribution of graph vertex degrees introduces workload imbalance issues,
rendering DGRW embarrassed to parallelize. In this paper, we propose
FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker
implements an efficient parallel sampling method to fully exploit the GPU
parallelism and reduce space complexity. Moreover, it employs a sampler-centric
paradigm alongside a dynamic scheduling strategy to handle the huge amounts of
walking queries. FlowWalker stands as a memory-efficient framework that
requires no auxiliary data structures in GPU global memory. We examine the
performance of FlowWalker extensively on ten datasets, and experiment results
show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared
with existing CPU, GPU, and FPGA random walk frameworks, respectively. Case
study shows that FlowWalker diminishes random walk time from 35% to 3% in a
pipeline of ByteDance friend recommendation GNN training.</description><identifier>DOI: 10.48550/arxiv.2404.08364</identifier><language>eng</language><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2404.08364$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2404.08364$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mei, Junyi</creatorcontrib><creatorcontrib>Sun, Shixuan</creatorcontrib><creatorcontrib>Li, Chao</creatorcontrib><creatorcontrib>Xu, Cheng</creatorcontrib><creatorcontrib>Chen, Cheng</creatorcontrib><creatorcontrib>Liu, Yibo</creatorcontrib><creatorcontrib>Wang, Jing</creatorcontrib><creatorcontrib>Zhao, Cheng</creatorcontrib><creatorcontrib>Hou, Xiaofeng</creatorcontrib><creatorcontrib>Guo, Minyi</creatorcontrib><creatorcontrib>He, Bingsheng</creatorcontrib><creatorcontrib>Cong, Xiaoliang</creatorcontrib><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><description>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing
structural relations within a graph. Effectively executing DGRW on GPU presents
certain challenges. First, existing sampling methods demand a pre-processing
buffer, causing substantial space complexity. Moreover, the power-law
distribution of graph vertex degrees introduces workload imbalance issues,
rendering DGRW embarrassed to parallelize. In this paper, we propose
FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker
implements an efficient parallel sampling method to fully exploit the GPU
parallelism and reduce space complexity. Moreover, it employs a sampler-centric
paradigm alongside a dynamic scheduling strategy to handle the huge amounts of
walking queries. FlowWalker stands as a memory-efficient framework that
requires no auxiliary data structures in GPU global memory. We examine the
performance of FlowWalker extensively on ten datasets, and experiment results
show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared
with existing CPU, GPU, and FPGA random walk frameworks, respectively. Case
study shows that FlowWalker diminishes random walk time from 35% to 3% in a
pipeline of ByteDance friend recommendation GNN training.</description><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71OwzAYhWEvDKhwAUz4Bhxc_yVhqwpJkYpAqIgx-hx_plbjJHIRJXdPW5jOcvRKDyE3c56pQmt-B-knfGdCcZXxQhp1SaDqhsMHdDtM93RBnzEOaWLofWgD9l8UekdX4XPLRkx-SBH6Fmn9-s4s7NHRh6mHGFpaJxi39O34HiI95WiVIOJhSLsrcuGh2-P1_87IpnrcLFds_VI_LRdrBiZXbC5MXnKhc5CFbUutJXdF67gCKBVaAbawqpTa594ZtBZL4XKtuAc0WhklZ-T2L3s2NmMKEdLUnKzN2Sp_AXXgTt0</recordid><startdate>20240412</startdate><enddate>20240412</enddate><creator>Mei, Junyi</creator><creator>Sun, Shixuan</creator><creator>Li, Chao</creator><creator>Xu, Cheng</creator><creator>Chen, Cheng</creator><creator>Liu, Yibo</creator><creator>Wang, Jing</creator><creator>Zhao, Cheng</creator><creator>Hou, Xiaofeng</creator><creator>Guo, Minyi</creator><creator>He, Bingsheng</creator><creator>Cong, Xiaoliang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240412</creationdate><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><author>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-126790257a38bc95530d8cd04aa94eb2ab8b4935f7fd6ebbe92d7540fae654643</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><toplevel>online_resources</toplevel><creatorcontrib>Mei, Junyi</creatorcontrib><creatorcontrib>Sun, Shixuan</creatorcontrib><creatorcontrib>Li, Chao</creatorcontrib><creatorcontrib>Xu, Cheng</creatorcontrib><creatorcontrib>Chen, 
Cheng</creatorcontrib><creatorcontrib>Liu, Yibo</creatorcontrib><creatorcontrib>Wang, Jing</creatorcontrib><creatorcontrib>Zhao, Cheng</creatorcontrib><creatorcontrib>Hou, Xiaofeng</creatorcontrib><creatorcontrib>Guo, Minyi</creatorcontrib><creatorcontrib>He, Bingsheng</creatorcontrib><creatorcontrib>Cong, Xiaoliang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mei, Junyi</au><au>Sun, Shixuan</au><au>Li, Chao</au><au>Xu, Cheng</au><au>Chen, Cheng</au><au>Liu, Yibo</au><au>Wang, Jing</au><au>Zhao, Cheng</au><au>Hou, Xiaofeng</au><au>Guo, Minyi</au><au>He, Bingsheng</au><au>Cong, Xiaoliang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</atitle><date>2024-04-12</date><risdate>2024</risdate><abstract>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing
structural relations within a graph. Effectively executing DGRW on GPU presents
certain challenges. First, existing sampling methods demand a pre-processing
buffer, causing substantial space complexity. Moreover, the power-law
distribution of graph vertex degrees introduces workload imbalance issues,
rendering DGRW embarrassed to parallelize. In this paper, we propose
FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker
implements an efficient parallel sampling method to fully exploit the GPU
parallelism and reduce space complexity. Moreover, it employs a sampler-centric
paradigm alongside a dynamic scheduling strategy to handle the huge amounts of
walking queries. FlowWalker stands as a memory-efficient framework that
requires no auxiliary data structures in GPU global memory. We examine the
performance of FlowWalker extensively on ten datasets, and experiment results
show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared
with existing CPU, GPU, and FPGA random walk frameworks, respectively. Case
study shows that FlowWalker diminishes random walk time from 35% to 3% in a
pipeline of ByteDance friend recommendation GNN training.</abstract><doi>10.48550/arxiv.2404.08364</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2404.08364 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2404_08364 |
source | arXiv.org |
subjects | Computer Science - Distributed, Parallel, and Cluster Computing |
title | FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T13%3A36%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=FlowWalker:%20A%20Memory-efficient%20and%20High-performance%20GPU-based%20Dynamic%20Graph%20Random%20Walk%20Framework&rft.au=Mei,%20Junyi&rft.date=2024-04-12&rft_id=info:doi/10.48550/arxiv.2404.08364&rft_dat=%3Carxiv_GOX%3E2404_08364%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
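
The description above mentions sampling without a pre-processing buffer. As a rough illustration only, the sketch below uses sequential weighted reservoir sampling, a standard single-pass technique that needs constant extra space. The record does not say which sampling method FlowWalker uses, so this choice is an assumption, and the names `sample_next`, `random_walk`, and `weight_fn` are hypothetical. It runs on the CPU in plain Python and does not model GPU parallelism or scheduling.

```python
import random
from typing import Callable, Dict, Hashable, Iterable, List, Optional

Vertex = Hashable


def sample_next(
    neighbors: Iterable[Vertex],
    weight_of: Callable[[Vertex], float],
    rng: random.Random,
) -> Optional[Vertex]:
    """Pick one neighbor with probability proportional to its weight.

    Single pass, O(1) extra space: no prefix-sum or alias table is built.
    Assumption: this is a generic weighted reservoir sampler, not
    necessarily the method used by FlowWalker.
    """
    total = 0.0
    chosen = None
    for v in neighbors:
        w = weight_of(v)
        if w <= 0:
            continue  # skip neighbors that cannot be selected
        total += w
        # Replace the current choice with probability w / total.
        if rng.random() * total < w:
            chosen = v
    return chosen


def random_walk(
    graph: Dict[Vertex, List[Vertex]],
    start: Vertex,
    length: int,
    weight_fn: Callable[[Optional[Vertex], Vertex, Vertex], float],
    seed: int = 0,
) -> List[Vertex]:
    """Run one walk whose transition weights depend on the walk state.

    weight_fn(prev, cur, nxt) is evaluated on the fly at every step, which
    is what makes the walk "dynamic": weights cannot be precomputed.
    """
    rng = random.Random(seed)
    walk = [start]
    prev, cur = None, start
    for _ in range(length):
        nxt = sample_next(
            graph.get(cur, ()),
            lambda v: weight_fn(prev, cur, v),
            rng,
        )
        if nxt is None:  # dead end: no neighbor with positive weight
            break
        walk.append(nxt)
        prev, cur = cur, nxt
    return walk


if __name__ == "__main__":
    g = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    # Hypothetical weight: discourage stepping straight back to prev.
    print(random_walk(g, 0, 5, lambda p, c, v: 0.25 if v == p else 1.0))
```

Because the weights are read once per step and discarded, memory use per walker stays constant regardless of vertex degree, which is the property the description highlights.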