PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs

Streaming graphs usually have to serve a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query on streaming graphs or CGQs on static graphs. They have a large number of redundant computations and expensive memory...

Detailed description

Bibliographic details
Published in:	ACM transactions on architecture and code optimization 2024-12, Vol.21 (4), p.1-25, Article 88
Main authors:	Mao, Fubing; Liu, Xu; Zhang, Yu; Liu, Haikun; Liao, Xiaofei; Jin, Hai; Zhang, Wei; Zhou, Jian; Wu, Yufei; Nie, Longyu; Guo, Yapu; Jiang, Zihan; Liu, Jingkang
Format:	Article
Language:	eng
Subjects:	Computer systems organization; Computing methodologies; Parallel architectures; Parallel computing methodologies
Online access:	Full text

container_end_page	25
container_issue	4
container_start_page	1
container_title	ACM transactions on architecture and code optimization
container_volume	21
creator	Mao, Fubing; Liu, Xu; Zhang, Yu; Liu, Haikun; Liao, Xiaofei; Jin, Hai; Zhang, Wei; Zhou, Jian; Wu, Yufei; Nie, Longyu; Guo, Yapu; Jiang, Zihan; Liu, Jingkang
description	Streaming graphs usually have to serve a large number of concurrent graph queries (CGQs). However, existing graph processing systems mainly optimize either a single graph query on streaming graphs or CGQs on static graphs. They incur many redundant computations and high memory access overhead, and cannot process CGQs on streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. First, PMGraph takes a fine-grained, data-centric approach: it uses vertex data to select the graph queries that meet the requirements, and it exploits the similarity between graph queries to merge the vertices they have in common, so that different queries no longer access the same data repeatedly and memory access overhead drops. It also adopts an update strategy that orders the vertices processed in each graph query according to the vertex dependence chain, which reduces redundant computations. Second, we propose a CGQs-oriented scheduling strategy that increases the data overlap among the graph queries being processed, further improving performance. Finally, PMGraph prefetches vertex information according to Frontier, the global active vertex set of all graph queries, which hides memory access latency. It also prefetches the vertices that several graph queries need to process, reducing memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves average speedups of 3.65×, 3.41×, 1.73×, and 1.38×, respectively. Experimental results show that PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.
doi_str_mv	10.1145/3689337
format	Article
fullrecord	<record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3689337</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3689337</sourcerecordid><originalsourceid>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</originalsourceid><addsrcrecordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><source>ACM Digital Library Complete</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creator><creatorcontrib>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</creatorcontrib><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><identifier>ISSN: 1544-3566</identifier><identifier>EISSN: 1544-3973</identifier><identifier>DOI: 10.1145/3689337</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computer systems organization ; Computing methodologies ; Parallel architectures ; Parallel computing methodologies</subject><ispartof>ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88</ispartof><rights>Copyright held by the owner/author(s).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</cites><orcidid>0009-0006-4432-8640 ; 0009-0005-2347-0472 ; 0009-0000-6932-0561 ; 0000-0003-0718-8045 ; 0009-0001-1076-0009 ; 0000-0003-2589-0073 ; 0000-0002-7622-6714 ; 0000-0003-4290-1408 ; 0009-0007-9015-5485 ; 0009-0005-5303-2078 ; 0000-0001-6302-813X ; 0000-0002-3934-7605 ; 0009-0006-2866-6893</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3689337$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,776,780,2275,27903,27904,40175,75975</link.rule.ids></links><search><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, 
Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><title>ACM transactions on architecture and code optimization</title><addtitle>ACM TACO</addtitle><description>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. 
Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</description><subject>Computer systems organization</subject><subject>Computing methodologies</subject><subject>Parallel architectures</subject><subject>Parallel computing methodologies</subject><issn>1544-3566</issn><issn>1544-3973</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9j01LxDAURYMoOI7i3lV2rqpJXtI07oY6jsKIirMvr-mrVqbtkHQE_73Ol6t74RwuXMYupbiRUptbSDMHYI_YSBqtE3AWjg_dpOkpO4vxSwjllBAjdv_6PAu4-rzjE-9pSQGHpvvged_5dQjUDXyL-duaQkOR998U-PsQCNuNt4XxnJ3UuIx0sc8xWzxMF_ljMn-ZPeWTeYJSCZuojJQwVWW1BZ-Bq1JRaqcrkMoiaesRstppUiorQQqDHhXUWpcoPegaxux6N-tDH2OguliFpsXwU0hRbL4X--9_5tXORN_-Swf4C4FnUqE</recordid><startdate>20241231</startdate><enddate>20241231</enddate><creator>Mao, Fubing</creator><creator>Liu, Xu</creator><creator>Zhang, Yu</creator><creator>Liu, Haikun</creator><creator>Liao, Xiaofei</creator><creator>Jin, Hai</creator><creator>Zhang, Wei</creator><creator>Zhou, Jian</creator><creator>Wu, Yufei</creator><creator>Nie, Longyu</creator><creator>Guo, Yapu</creator><creator>Jiang, Zihan</creator><creator>Liu, 
Jingkang</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid></search><sort><creationdate>20241231</creationdate><title>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</title><author>Mao, Fubing ; Liu, Xu ; Zhang, Yu ; Liu, Haikun ; Liao, Xiaofei ; Jin, Hai ; Zhang, Wei ; Zhou, Jian ; Wu, Yufei ; Nie, Longyu ; Guo, Yapu ; Jiang, Zihan ; Liu, Jingkang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a1207-28e205dd7473c839d60b494d3127ae47ca38f94e228b3105aca23f44ba1c34f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer systems organization</topic><topic>Computing methodologies</topic><topic>Parallel architectures</topic><topic>Parallel computing methodologies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mao, Fubing</creatorcontrib><creatorcontrib>Liu, Xu</creatorcontrib><creatorcontrib>Zhang, Yu</creatorcontrib><creatorcontrib>Liu, Haikun</creatorcontrib><creatorcontrib>Liao, Xiaofei</creatorcontrib><creatorcontrib>Jin, Hai</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Zhou, 
Jian</creatorcontrib><creatorcontrib>Wu, Yufei</creatorcontrib><creatorcontrib>Nie, Longyu</creatorcontrib><creatorcontrib>Guo, Yapu</creatorcontrib><creatorcontrib>Jiang, Zihan</creatorcontrib><creatorcontrib>Liu, Jingkang</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on architecture and code optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mao, Fubing</au><au>Liu, Xu</au><au>Zhang, Yu</au><au>Liu, Haikun</au><au>Liao, Xiaofei</au><au>Jin, Hai</au><au>Zhang, Wei</au><au>Zhou, Jian</au><au>Wu, Yufei</au><au>Nie, Longyu</au><au>Guo, Yapu</au><au>Jiang, Zihan</au><au>Liu, Jingkang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs</atitle><jtitle>ACM transactions on architecture and code optimization</jtitle><stitle>ACM TACO</stitle><date>2024-12-31</date><risdate>2024</risdate><volume>21</volume><issue>4</issue><spage>1</spage><epage>25</epage><pages>1-25</pages><artnum>88</artnum><issn>1544-3566</issn><eissn>1544-3973</eissn><abstract>There are usually a large number of concurrent graph queries (CGQs) requirements in streaming graphs. However, existing graph processing systems mainly optimize a single graph query in streaming graphs or CGQs in static graphs. They have a large number of redundant computations and expensive memory access overhead, and cannot process CGQs in streaming graphs efficiently. To address these issues, we propose PMGraph, a software-hardware collaborative accelerator for efficient processing of CGQs in streaming graphs. 
First, PMGraph centers on fine-grained data, selects graph queries that meet the requirements through vertex data, and utilizes the similarity between different graph queries to merge the same vertices they need to process to address the problem of a large amount of repeated access to the same data by different graph queries in CGQs, thereby reducing memory access overhead. Furthermore, it adopts the update strategy that regularizes the processing order of vertices in each graph query according to the order of the vertex dependence chain, consequently effectively reducing redundant computations. Second, we propose a CGQs-oriented scheduling strategy to increase the data overlap when different graph queries are processed, thereby further improving the performance. Finally, PMGraph prefetches the vertex information according to the global active vertex set Frontier of all graph queries, hiding the memory access latency. It also provides prefetching for the same vertices that need to be processed by different graph queries, reducing the memory access overhead. Compared with the state-of-the-art concurrent graph query software systems Kickstarter-C and Tripoline, PMGraph achieves average speedups of 5.57× and 4.58×, respectively. Compared with the state-of-the-art hardware accelerators Minnow, HATS, LCCG, and JetStream, PMGraph achieves the speedup of 3.65×, 3.41×, 1.73×, and 1.38× on average, respectively. 
Experimental results show that our proposed PMGraph outperforms the state-of-the-art concurrent graph processing systems and hardware accelerators.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3689337</doi><tpages>25</tpages><orcidid>https://orcid.org/0009-0006-4432-8640</orcidid><orcidid>https://orcid.org/0009-0005-2347-0472</orcidid><orcidid>https://orcid.org/0009-0000-6932-0561</orcidid><orcidid>https://orcid.org/0000-0003-0718-8045</orcidid><orcidid>https://orcid.org/0009-0001-1076-0009</orcidid><orcidid>https://orcid.org/0000-0003-2589-0073</orcidid><orcidid>https://orcid.org/0000-0002-7622-6714</orcidid><orcidid>https://orcid.org/0000-0003-4290-1408</orcidid><orcidid>https://orcid.org/0009-0007-9015-5485</orcidid><orcidid>https://orcid.org/0009-0005-5303-2078</orcidid><orcidid>https://orcid.org/0000-0001-6302-813X</orcidid><orcidid>https://orcid.org/0000-0002-3934-7605</orcidid><orcidid>https://orcid.org/0009-0006-2866-6893</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1544-3566
ispartof	ACM transactions on architecture and code optimization, 2024-12, Vol.21 (4), p.1-25, Article 88
issn	1544-3566 1544-3973
language	eng
recordid	cdi_crossref_primary_10_1145_3689337
source	ACM Digital Library Complete; EZB-FREE-00999 freely available EZB journals
subjects	Computer systems organization; Computing methodologies; Parallel architectures; Parallel computing methodologies
title	PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T09%3A07%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PMGraph:%20Accelerating%20Concurrent%20Graph%20Queries%20over%20Streaming%20Graphs&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Mao,%20Fubing&rft.date=2024-12-31&rft.volume=21&rft.issue=4&rft.spage=1&rft.epage=25&rft.pages=1-25&rft.artnum=88&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/3689337&rft_dat=%3Cacm_cross%3E3689337%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
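
The abstract in this record describes merging the vertices that different concurrent queries need to process, so that shared data is accessed once rather than once per query. The following is only a minimal software sketch of that general idea, not the authors' implementation (PMGraph is a software-hardware accelerator, and its actual data structures are not given in this record). The function names `merge_frontiers` and `count_vertex_loads`, the query IDs, and the example frontiers are all hypothetical.

```python
from collections import defaultdict


def merge_frontiers(frontiers):
    """Group active vertices by ID across queries.

    frontiers: dict mapping a (hypothetical) query ID to the set of
    vertex IDs that query currently has active.
    Returns a dict mapping each active vertex to the sorted list of
    queries that need it, so the vertex can be fetched once for all.
    """
    shared = defaultdict(list)
    for query_id in sorted(frontiers):
        for vertex in frontiers[query_id]:
            shared[vertex].append(query_id)
    return dict(shared)


def count_vertex_loads(frontiers):
    """Return (loads when each query fetches alone, loads after merging)."""
    separate = sum(len(active) for active in frontiers.values())
    merged = len(merge_frontiers(frontiers))
    return separate, merged


# Three made-up concurrent queries whose active vertex sets overlap.
frontiers = {"q0": {1, 2, 3}, "q1": {2, 3, 4}, "q2": {3, 4, 5}}
global_frontier = merge_frontiers(frontiers)
separate_loads, merged_loads = count_vertex_loads(frontiers)
```

In this toy input the three queries touch nine vertices in total but only five distinct ones, so a merged global frontier would need fewer vertex loads; how much real workloads overlap is an empirical question that the sketch does not address.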