PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs

Streaming graphs typically have to serve a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query over streaming graphs or CGQs over static graphs. They have a large number of redundant computations and expensive memory...

Full description

Bibliographic details
Published in: ACM transactions on architecture and code optimization 2024-12, Vol.21 (4), p.1-25, Article 88
Main authors: Mao, Fubing, Liu, Xu, Zhang, Yu, Liu, Haikun, Liao, Xiaofei, Jin, Hai, Zhang, Wei, Zhou, Jian, Wu, Yufei, Nie, Longyu, Guo, Yapu, Jiang, Zihan, Liu, Jingkang
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 25
container_issue 4
container_start_page 1
container_title ACM transactions on architecture and code optimization
container_volume 21
creator Mao, Fubing
Liu, Xu
Zhang, Yu
Liu, Haikun
Liao, Xiaofei
Jin, Hai
Zhang, Wei
Zhou, Jian
Wu, Yufei
Nie, Longyu
Guo, Yapu
Jiang, Zihan
Liu, Jingkang
description Streaming graphs typically have to serve a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query over streaming graphs or CGQs over static graphs. They incur many redundant computations and expensive memory access overhead, and cannot process CGQs over streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs over streaming graphs. First, PMGraph takes a fine-grained, data-centric approach: it uses vertex data to select the graph queries that meet the requirements, and it exploits the similarity between graph queries to merge the vertices they have in common, so that different queries no longer access the same data repeatedly, which reduces memory access overhead. It also adopts an update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, which reduces redundant computations. Second, we propose a CGQs-oriented scheduling strategy that increases the data overlap among graph queries as they are processed, further improving performance. Finally, PMGraph prefetches vertex information according to Frontier, the global active vertex set of all graph queries, hiding memory access latency. It also prefetches the vertices that several graph queries need to process, reducing memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves average speedups of 3.65×, 3.41×, 1.73×, and 1.38×, respectively.
Experimental results show that PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.
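The idea of merging the vertices that several queries need to process can be illustrated with a small software sketch. This is not PMGraph's design, which the record describes only at the abstract level; it is a minimal illustration under several assumptions: the graph is a static snapshot (no streaming updates), every query is a breadth-first search, and the names `concurrent_bfs`, `adj`, `sources`, and `loads` are hypothetical. The sketch keeps one global frontier that maps each active vertex to the set of queries it is active for, so a vertex's adjacency list is fetched once per step no matter how many queries need it.

```python
from collections import defaultdict


def concurrent_bfs(adj, sources):
    """Run one BFS per source over a shared global frontier (illustrative only).

    adj:     dict mapping a vertex to the list of its out-neighbors
    sources: list of start vertices, one per query
    Returns (dist, loads): dist[q] maps vertex -> hop count for query q,
    loads counts how many adjacency lists were fetched in total.
    """
    dist = [{s: 0} for s in sources]

    # Global frontier: active vertex -> ids of the queries it is active for.
    frontier = defaultdict(set)
    for q, s in enumerate(sources):
        frontier[s].add(q)

    loads = 0
    while frontier:
        next_frontier = defaultdict(set)
        for v, queries in frontier.items():
            neighbors = adj.get(v, [])  # fetched once, reused by every query
            loads += 1
            for q in queries:
                d = dist[q][v] + 1
                for u in neighbors:
                    if u not in dist[q]:  # first visit of u by query q
                        dist[q][u] = d
                        next_frontier[u].add(q)
        frontier = next_frontier
    return dist, loads


if __name__ == "__main__":
    # Two queries whose traversals meet at vertex 2.
    graph = {0: [2], 1: [2], 2: [3], 3: []}
    distances, fetched = concurrent_bfs(graph, [0, 1])
    print(distances, fetched)
```

On this four-vertex graph, running the two searches separately would fetch six adjacency lists (three per query), whereas the shared frontier fetches four, because vertices 2 and 3 are each loaded once for both queries. The saving grows with the overlap between queries, which is the effect a scheduling strategy that increases data overlap would aim to amplify.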
doi_str_mv 10.1145/3689337
format Article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3689337</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3689337</sourcerecordid><originalsourceid>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</originalsourceid><addsrcrecordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><source>ACM Digital Library Complete</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creator><creatorcontrib>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creatorcontrib><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><identifier>ISSN: 1544-3566</identifier><identifier>EISSN: 1544-3973</identifier><identifier>DOI: 10.1145/3689337</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computer systems organization ; Computing methodologies ; Parallel architectures ; Parallel computing methodologies</subject><ispartof>ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88</ispartof><rights>Copyright held by the owner/author(s).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</cites><orcidid>0009-0006-4432-8640 ; 0009-0005-2347-0472 ; 0009-0000-6932-0561 ; 0000-0003-0718-8045 ; 0009-0001-1076-0009 ; 0000-0003-2589-0073 ; 0000-0002-7622-6714 ; 0000-0003-4290-1408 ; 0009-0007-9015-5485 ; 0009-0005-5303-2078 ; 0000-0001-6302-813X ; 0000-0002-3934-7605 ; 0009-0006-2866-6893</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3689337$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,776,780,2275,27903,27904,40175,75975</link.rule.ids></links><search><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, 
Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><title>ACM transactions on architecture and code optimization</title><addtitle>ACM TACO</addtitle><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. 
Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><subject>Computer systems organization</subject><subject>Computing methodologies</subject><subject>Parallel architectures</subject><subject>Parallel computing methodologies</subject><issn>1544-3566</issn><issn>1544-3973</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</recordid><startdate>20241231</startdate><enddate>20241231</enddate><creator>Mao, Fubing</creator><creator>Liu, Xu</creator><creator>Zhang, Yu</creator><creator>Liu, Haikun</creator><creator>Liao, Xiaofei</creator><creator>Jin, Hai</creator><creator>Zhang, Wei</creator><creator>Zhou, Jian</creator><creator>Wu, Yufei</creator><creator>Nie, Longyu</creator><creator>Guo, Yapu</creator><creator>Jiang, Zihan</creator><creator>Liu, 
Jingkang</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid></search><sort><creationdate>20241231</creationdate><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><author>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer systems organization</topic><topic>Computing methodologies</topic><topic>Parallel architectures</topic><topic>Parallel computing methodologies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, 
Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on architecture and code optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mao, Fubing</au><au>Liu, Xu</au><au>Zhang, Yu</au><au>Liu, Haikun</au><au>Liao, Xiaofei</au><au>Jin, Hai</au><au>Zhang, Wei</au><au>Zhou, Jian</au><au>Wu, Yufei</au><au>Nie, Longyu</au><au>Guo, Yapu</au><au>Jiang, Zihan</au><au>Liu, Jingkang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</atitle><jtitle>ACM transactions on architecture and code optimization</jtitle><stitle>ACM TACO</stitle><date>2024-12-31</date><risdate>2024</risdate><volume>21</volume><issue>4</issue><spage>1</spage><epage>25</epage><pages>1-25</pages><artnum>88</artnum><issn>1544-3566</issn><eissn>1544-3973</eissn><abstract>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3689337</doi><tpages>25</tpages><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1544-3566
ispartof ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88
issn 1544-3566
1544-3973
language eng
recordid cdi_crossref_primary_10_1145_3689337
source ACM Digital Library Complete; EZB-FREE-00999 freely available EZB journals
subjects Computer systems organization
Computing methodologies
Parallel architectures
Parallel computing methodologies
title PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T09%3A07%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PMGraph:%20Accelerating%20Concurrent%20Graph%20Queries%20over%20Streaming%20Graphs&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Mao,%20Fubing&rft.date=2024-12-31&rft.volume=21&rft.issue=4&rft.spage=1&rft.epage=25&rft.pages=1-25&rft.artnum=88&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/3689337&rft_dat=%3Cacm_cross%3E3689337%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true