PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs
Streaming graphs are usually subject to a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query over streaming graphs or CGQs over static graphs. They incur a large number of redundant computations and expensive memory...
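Published in: | ACM transactions on architecture and code optimization 2024-12, Vol.21 (4), p.1-25, Article 88 |
---|---|
Main authors: | Mao, Fubing; Liu, Xu; Zhang, Yu; Liu, Haikun; Liao, Xiaofei; Jin, Hai; Zhang, Wei; Zhou, Jian; Wu, Yufei; Nie, Longyu; Guo, Yapu; Jiang, Zihan; Liu, Jingkang |
Format: | Article |
Language: | eng |
Subjects: | Computer systems organization; Computing methodologies; Parallel architectures; Parallel computing methodologies |
Online access: | Full text |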
container_end_page | 25 |
---|---|
container_issue | 4 |
container_start_page | 1 |
container_title | ACM transactions on architecture and code optimization |
container_volume | 21 |
creator | Mao, Fubing; Liu, Xu; Zhang, Yu; Liu, Haikun; Liao, Xiaofei; Jin, Hai; Zhang, Wei; Zhou, Jian; Wu, Yufei; Nie, Longyu; Guo, Yapu; Jiang, Zihan; Liu, Jingkang |
description | Streaming graphs are usually subject to a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query over streaming graphs or CGQs over static graphs. They incur a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. First, PMGraph centers on fine-grained data: it uses vertex data to select the graph queries that meet the requirements, and it exploits the similarity between graph queries to merge the vertices they have in common. This addresses the large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts an update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy that increases the data overlap when different graph queries are processed, further improving performance. Finally, PMGraph prefetches vertex information according to the global active vertex set (Frontier) of all graph queries, hiding memory access latency. It also prefetches the vertices that need to be processed by several graph queries, reducing memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves average speedups of 3.65×, 3.41×, 1.73×, and 1.38×, respectively. Experimental results show that PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators. |
doi_str_mv | 10.1145/3689337 |
format | Article |
fullrecord | <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3689337</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3689337</sourcerecordid><originalsourceid>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</originalsourceid><addsrcrecordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><source>ACM Digital Library Complete</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creator><creatorcontrib>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creatorcontrib><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><identifier>ISSN: 1544-3566</identifier><identifier>EISSN: 1544-3973</identifier><identifier>DOI: 10.1145/3689337</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computer systems organization ; Computing methodologies ; Parallel architectures ; Parallel computing methodologies</subject><ispartof>ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88</ispartof><rights>Copyright held by the owner/author(s).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</cites><orcidid>0009-0006-4432-8640 ; 0009-0005-2347-0472 ; 0009-0000-6932-0561 ; 0000-0003-0718-8045 ; 0009-0001-1076-0009 ; 0000-0003-2589-0073 ; 0000-0002-7622-6714 ; 0000-0003-4290-1408 ; 0009-0007-9015-5485 ; 0009-0005-5303-2078 ; 0000-0001-6302-813X ; 0000-0002-3934-7605 ; 0009-0006-2866-6893</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3689337$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,776,780,2275,27903,27904,40175,75975</link.rule.ids></links><search><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, 
Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><title>ACM transactions on architecture and code optimization</title><addtitle>ACM TACO</addtitle><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. 
Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><subject>Computer systems organization</subject><subject>Computing methodologies</subject><subject>Parallel architectures</subject><subject>Parallel computing methodologies</subject><issn>1544-3566</issn><issn>1544-3973</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</recordid><startdate>20241231</startdate><enddate>20241231</enddate><creator>Mao, Fubing</creator><creator>Liu, Xu</creator><creator>Zhang, Yu</creator><creator>Liu, Haikun</creator><creator>Liao, Xiaofei</creator><creator>Jin, Hai</creator><creator>Zhang, Wei</creator><creator>Zhou, Jian</creator><creator>Wu, Yufei</creator><creator>Nie, Longyu</creator><creator>Guo, Yapu</creator><creator>Jiang, Zihan</creator><creator>Liu, 
Jingkang</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid></search><sort><creationdate>20241231</creationdate><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><author>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer systems organization</topic><topic>Computing methodologies</topic><topic>Parallel architectures</topic><topic>Parallel computing methodologies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, 
Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on architecture and code optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mao, Fubing</au><au>Liu, Xu</au><au>Zhang, Yu</au><au>Liu, Haikun</au><au>Liao, Xiaofei</au><au>Jin, Hai</au><au>Zhang, Wei</au><au>Zhou, Jian</au><au>Wu, Yufei</au><au>Nie, Longyu</au><au>Guo, Yapu</au><au>Jiang, Zihan</au><au>Liu, Jingkang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</atitle><jtitle>ACM transactions on architecture and code optimization</jtitle><stitle>ACM TACO</stitle><date>2024-12-31</date><risdate>2024</risdate><volume>21</volume><issue>4</issue><spage>1</spage><epage>25</epage><pages>1-25</pages><artnum>88</artnum><issn>1544-3566</issn><eissn>1544-3973</eissn><abstract>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3689337</doi><tpages>25</tpages><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1544-3566 |
ispartof | ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88 |
issn | 1544-3566; 1544-3973 |
language | eng |
recordid | cdi_crossref_primary_10_1145_3689337 |
source | ACM Digital Library Complete; EZB-FREE-00999 freely available EZB journals |
subjects | Computer systems organization; Computing methodologies; Parallel architectures; Parallel computing methodologies |
title | PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T09%3A07%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PMGraph:%20Accelerating%20Concurrent%20Graph%20Queries%20over%20Streaming%20Graphs&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Mao,%20Fubing&rft.date=2024-12-31&rft.volume=21&rft.issue=4&rft.spage=1&rft.epage=25&rft.pages=1-25&rft.artnum=88&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/3689337&rft_dat=%3Cacm_cross%3E3689337%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |