FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework

Dynamic graph random walk (DGRW) has emerged as a practical tool for capturing structural relations within a graph. Executing DGRW efficiently on GPUs presents several challenges. First, existing sampling methods require a pre-processing buffer, which incurs substantial space overhead. Moreover, the power-la...

Bibliographic details
Main authors: Mei, Junyi; Sun, Shixuan; Li, Chao; Xu, Cheng; Chen, Cheng; Liu, Yibo; Wang, Jing; Zhao, Cheng; Hou, Xiaofeng; Guo, Minyi; He, Bingsheng; Cong, Xiaoliang
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Mei, Junyi
Sun, Shixuan
Li, Chao
Xu, Cheng
Chen, Cheng
Liu, Yibo
Wang, Jing
Zhao, Cheng
Hou, Xiaofeng
Guo, Minyi
He, Bingsheng
Cong, Xiaoliang
description Dynamic graph random walk (DGRW) has emerged as a practical tool for capturing structural relations within a graph. Executing DGRW efficiently on GPUs presents several challenges. First, existing sampling methods require a pre-processing buffer, which incurs substantial space overhead. Moreover, the power-law distribution of vertex degrees causes workload imbalance, making DGRW difficult to parallelize. In this paper, we propose FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker implements an efficient parallel sampling method to fully exploit GPU parallelism and reduce space complexity. Moreover, it employs a sampler-centric paradigm alongside a dynamic scheduling strategy to handle massive numbers of walking queries. FlowWalker is a memory-efficient framework that requires no auxiliary data structures in GPU global memory. We evaluate FlowWalker extensively on ten datasets, and experimental results show that it achieves up to 752.2x, 72.1x, and 16.4x speedup over existing CPU, GPU, and FPGA random walk frameworks, respectively. A case study shows that FlowWalker reduces the share of time spent on random walks from 35% to 3% in a ByteDance friend recommendation GNN training pipeline.
doi_str_mv 10.48550/arxiv.2404.08364
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2404_08364</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2404_08364</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-126790257a38bc95530d8cd04aa94eb2ab8b4935f7fd6ebbe92d7540fae654643</originalsourceid><addsrcrecordid>eNotz71OwzAYhWEvDKhwAUz4Bhxc_yVhqwpJkYpAqIgx-hx_plbjJHIRJXdPW5jOcvRKDyE3c56pQmt-B-knfGdCcZXxQhp1SaDqhsMHdDtM93RBnzEOaWLofWgD9l8UekdX4XPLRkx-SBH6Fmn9-s4s7NHRh6mHGFpaJxi39O34HiI95WiVIOJhSLsrcuGh2-P1_87IpnrcLFds_VI_LRdrBiZXbC5MXnKhc5CFbUutJXdF67gCKBVaAbawqpTa594ZtBZL4XKtuAc0WhklZ-T2L3s2NmMKEdLUnKzN2Sp_AXXgTt0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><source>arXiv.org</source><creator>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</creator><creatorcontrib>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</creatorcontrib><description>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing structural relations within a graph. Effectively executing DGRW on GPU presents certain challenges. First, existing sampling methods demand a pre-processing buffer, causing substantial space complexity. Moreover, the power-law distribution of graph vertex degrees introduces workload imbalance issues, rendering DGRW embarrassed to parallelize. In this paper, we propose FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker implements an efficient parallel sampling method to fully exploit the GPU parallelism and reduce space complexity. 
Moreover, it employs a sampler-centric paradigm alongside a dynamic scheduling strategy to handle the huge amounts of walking queries. FlowWalker stands as a memory-efficient framework that requires no auxiliary data structures in GPU global memory. We examine the performance of FlowWalker extensively on ten datasets, and experiment results show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared with existing CPU, GPU, and FPGA random walk frameworks, respectively. Case study shows that FlowWalker diminishes random walk time from 35% to 3% in a pipeline of ByteDance friend recommendation GNN training.</description><identifier>DOI: 10.48550/arxiv.2404.08364</identifier><language>eng</language><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2404.08364$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2404.08364$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mei, Junyi</creatorcontrib><creatorcontrib>Sun, Shixuan</creatorcontrib><creatorcontrib>Li, Chao</creatorcontrib><creatorcontrib>Xu, Cheng</creatorcontrib><creatorcontrib>Chen, Cheng</creatorcontrib><creatorcontrib>Liu, Yibo</creatorcontrib><creatorcontrib>Wang, Jing</creatorcontrib><creatorcontrib>Zhao, Cheng</creatorcontrib><creatorcontrib>Hou, Xiaofeng</creatorcontrib><creatorcontrib>Guo, Minyi</creatorcontrib><creatorcontrib>He, Bingsheng</creatorcontrib><creatorcontrib>Cong, 
Xiaoliang</creatorcontrib><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><description>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing structural relations within a graph. Effectively executing DGRW on GPU presents certain challenges. First, existing sampling methods demand a pre-processing buffer, causing substantial space complexity. Moreover, the power-law distribution of graph vertex degrees introduces workload imbalance issues, rendering DGRW embarrassed to parallelize. In this paper, we propose FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker implements an efficient parallel sampling method to fully exploit the GPU parallelism and reduce space complexity. Moreover, it employs a sampler-centric paradigm alongside a dynamic scheduling strategy to handle the huge amounts of walking queries. FlowWalker stands as a memory-efficient framework that requires no auxiliary data structures in GPU global memory. We examine the performance of FlowWalker extensively on ten datasets, and experiment results show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared with existing CPU, GPU, and FPGA random walk frameworks, respectively. 
Case study shows that FlowWalker diminishes random walk time from 35% to 3% in a pipeline of ByteDance friend recommendation GNN training.</description><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71OwzAYhWEvDKhwAUz4Bhxc_yVhqwpJkYpAqIgx-hx_plbjJHIRJXdPW5jOcvRKDyE3c56pQmt-B-knfGdCcZXxQhp1SaDqhsMHdDtM93RBnzEOaWLofWgD9l8UekdX4XPLRkx-SBH6Fmn9-s4s7NHRh6mHGFpaJxi39O34HiI95WiVIOJhSLsrcuGh2-P1_87IpnrcLFds_VI_LRdrBiZXbC5MXnKhc5CFbUutJXdF67gCKBVaAbawqpTa594ZtBZL4XKtuAc0WhklZ-T2L3s2NmMKEdLUnKzN2Sp_AXXgTt0</recordid><startdate>20240412</startdate><enddate>20240412</enddate><creator>Mei, Junyi</creator><creator>Sun, Shixuan</creator><creator>Li, Chao</creator><creator>Xu, Cheng</creator><creator>Chen, Cheng</creator><creator>Liu, Yibo</creator><creator>Wang, Jing</creator><creator>Zhao, Cheng</creator><creator>Hou, Xiaofeng</creator><creator>Guo, Minyi</creator><creator>He, Bingsheng</creator><creator>Cong, Xiaoliang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240412</creationdate><title>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</title><author>Mei, Junyi ; Sun, Shixuan ; Li, Chao ; Xu, Cheng ; Chen, Cheng ; Liu, Yibo ; Wang, Jing ; Zhao, Cheng ; Hou, Xiaofeng ; Guo, Minyi ; He, Bingsheng ; Cong, Xiaoliang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-126790257a38bc95530d8cd04aa94eb2ab8b4935f7fd6ebbe92d7540fae654643</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><toplevel>online_resources</toplevel><creatorcontrib>Mei, Junyi</creatorcontrib><creatorcontrib>Sun, Shixuan</creatorcontrib><creatorcontrib>Li, 
Chao</creatorcontrib><creatorcontrib>Xu, Cheng</creatorcontrib><creatorcontrib>Chen, Cheng</creatorcontrib><creatorcontrib>Liu, Yibo</creatorcontrib><creatorcontrib>Wang, Jing</creatorcontrib><creatorcontrib>Zhao, Cheng</creatorcontrib><creatorcontrib>Hou, Xiaofeng</creatorcontrib><creatorcontrib>Guo, Minyi</creatorcontrib><creatorcontrib>He, Bingsheng</creatorcontrib><creatorcontrib>Cong, Xiaoliang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mei, Junyi</au><au>Sun, Shixuan</au><au>Li, Chao</au><au>Xu, Cheng</au><au>Chen, Cheng</au><au>Liu, Yibo</au><au>Wang, Jing</au><au>Zhao, Cheng</au><au>Hou, Xiaofeng</au><au>Guo, Minyi</au><au>He, Bingsheng</au><au>Cong, Xiaoliang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework</atitle><date>2024-04-12</date><risdate>2024</risdate><abstract>Dynamic graph random walk (DGRW) emerges as a practical tool for capturing structural relations within a graph. Effectively executing DGRW on GPU presents certain challenges. First, existing sampling methods demand a pre-processing buffer, causing substantial space complexity. Moreover, the power-law distribution of graph vertex degrees introduces workload imbalance issues, rendering DGRW embarrassed to parallelize. In this paper, we propose FlowWalker, a GPU-based dynamic graph random walk framework. FlowWalker implements an efficient parallel sampling method to fully exploit the GPU parallelism and reduce space complexity. Moreover, it employs a sampler-centric paradigm alongside a dynamic scheduling strategy to handle the huge amounts of walking queries. FlowWalker stands as a memory-efficient framework that requires no auxiliary data structures in GPU global memory. 
We examine the performance of FlowWalker extensively on ten datasets, and experiment results show that FlowWalker achieves up to 752.2x, 72.1x, and 16.4x speedup compared with existing CPU, GPU, and FPGA random walk frameworks, respectively. Case study shows that FlowWalker diminishes random walk time from 35% to 3% in a pipeline of ByteDance friend recommendation GNN training.</abstract><doi>10.48550/arxiv.2404.08364</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2404.08364
ispartof
issn
language eng
recordid cdi_arxiv_primary_2404_08364
source arXiv.org
subjects Computer Science - Distributed, Parallel, and Cluster Computing
title FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T13%3A36%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=FlowWalker:%20A%20Memory-efficient%20and%20High-performance%20GPU-based%20Dynamic%20Graph%20Random%20Walk%20Framework&rft.au=Mei,%20Junyi&rft.date=2024-04-12&rft_id=info:doi/10.48550/arxiv.2404.08364&rft_dat=%3Carxiv_GOX%3E2404_08364%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true