Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems

The increasing scale and wealth of inter-connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, real-world graphs are famously difficult to process efficientl...

Detailed description

Bibliographic details
Main authors:	Gharaibeh, Abdullah; Reza, Tahsin; Santos-Neto, Elizeu; Costa, Lauro Beltrao; Sallinen, Scott; Ripeanu, Matei
Format:	Article
Language:	eng
Subjects:	Computer Science - Distributed, Parallel, and Cluster Computing
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Gharaibeh, Abdullah ; Reza, Tahsin ; Santos-Neto, Elizeu ; Costa, Lauro Beltrao ; Sallinen, Scott ; Ripeanu, Matei
description	The increasing scale and wealth of inter-connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, real-world graphs are famously difficult to process efficiently. Not only do they have a large memory footprint, but most graph algorithms also entail memory access patterns with poor locality, data-dependent parallelism, and a low compute-to-memory access ratio. Moreover, most real-world graphs have a highly heterogeneous node degree distribution; hence, partitioning these graphs for parallel processing while simultaneously achieving access locality and load balancing is difficult. This work starts from the hypothesis that hybrid platforms (e.g., GPU-accelerated systems) have the potential both to cope with the heterogeneous structure of real graphs and to offer a cost-effective platform for high-performance graph processing. This work assesses this hypothesis and presents an extensive exploration of the opportunity to harness hybrid systems to process large-scale graphs efficiently. In particular, (i) we present a performance model that estimates the achievable performance on hybrid platforms; (ii) informed by the performance model, we design and develop TOTEM, a processing engine that provides a convenient environment to implement graph algorithms on hybrid platforms; (iii) we show that further performance gains can be extracted using partitioning strategies that aim to produce partitions that each match the strengths of the processing element to which they are allocated; and finally, (iv) we demonstrate the performance advantages of the hybrid system through a comprehensive evaluation that uses real and synthetic workloads (as large as 16 billion edges), multiple graph algorithms that stress the system in various ways, and a variety of hardware configurations.
doi_str_mv	10.48550/arxiv.1312.3018
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1312_3018</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1312_3018</sourcerecordid><originalsourceid>FETCH-LOGICAL-a658-ada1c4fae1202cd010935f07a46f40cf7bab78533815f54d72a2d15624b99dfb3</originalsourceid><addsrcrecordid>eNotz8tqwkAUgOHZuCjafVdlXiBxrslkWaKNhYCCdh3OXI4d0CgzUpq3b227-nc_fIQ8cVYqozVbQvqKnyWXXJSScfNAVmvE6GIYb7SHdAzF3sEp0C7B9YPu0sWFnON4pJeRbiaboqft7p3C6Gn30_2Ub-GcF2SGcMrh8b9zcnhdH9pN0W-7t_alL6DSpgAP3CmEwAUTzjPOGqmR1aAqVMxhbcHWRktpuEatfC1AeK4roWzTeLRyTp7_tr-K4ZriGdI03DXDXSO_AQC2Q2U</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems</title><source>arXiv.org</source><creator>Gharaibeh, Abdullah ; Reza, Tahsin ; Santos-Neto, Elizeu ; Costa, Lauro Beltrao ; Sallinen, Scott ; Ripeanu, Matei</creator><creatorcontrib>Gharaibeh, Abdullah ; Reza, Tahsin ; Santos-Neto, Elizeu ; Costa, Lauro Beltrao ; Sallinen, Scott ; Ripeanu, Matei</creatorcontrib><description>The increasing scale and wealth of inter-connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, real-world graphs are famously difficult to process efficiently. Not only they have a large memory footprint, but also most graph algorithms entail memory access patterns with poor locality, data-dependent parallelism and a low compute-to-memory access ratio. Moreover, most real-world graphs have a highly heterogeneous node degree distribution, hence partitioning these graphs for parallel processing and simultaneously achieving access locality and load-balancing is difficult. 
This work starts from the hypothesis that hybrid platforms (e.g., GPU-accelerated systems) have both the potential to cope with the heterogeneous structure of real graphs and to offer a cost-effective platform for high-performance graph processing. This work assesses this hypothesis and presents an extensive exploration of the opportunity to harness hybrid systems to process large-scale graphs efficiently. In particular, (i) we present a performance model that estimates the achievable performance on hybrid platforms; (ii) informed by the performance model, we design and develop TOTEM - a processing engine that provides a convenient environment to implement graph algorithms on hybrid platforms; (iii) we show that further performance gains can be extracted using partitioning strategies that aim to produce partitions that each matches the strengths of the processing element it is allocated to, finally, (iv) we demonstrate the performance advantages of the hybrid system through a comprehensive evaluation that uses real and synthetic workloads (as large as 16 billion edges), multiple graph algorithms that stress the system in various ways, and a variety of hardware configurations.</description><identifier>DOI: 10.48550/arxiv.1312.3018</identifier><language>eng</language><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><creationdate>2013-12</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1312.3018$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1312.3018$$DView paper in 
arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Gharaibeh, Abdullah</creatorcontrib><creatorcontrib>Reza, Tahsin</creatorcontrib><creatorcontrib>Santos-Neto, Elizeu</creatorcontrib><creatorcontrib>Costa, Lauro Beltrao</creatorcontrib><creatorcontrib>Sallinen, Scott</creatorcontrib><creatorcontrib>Ripeanu, Matei</creatorcontrib><title>Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems</title><description>The increasing scale and wealth of inter-connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, real-world graphs are famously difficult to process efficiently. Not only they have a large memory footprint, but also most graph algorithms entail memory access patterns with poor locality, data-dependent parallelism and a low compute-to-memory access ratio. Moreover, most real-world graphs have a highly heterogeneous node degree distribution, hence partitioning these graphs for parallel processing and simultaneously achieving access locality and load-balancing is difficult. This work starts from the hypothesis that hybrid platforms (e.g., GPU-accelerated systems) have both the potential to cope with the heterogeneous structure of real graphs and to offer a cost-effective platform for high-performance graph processing. This work assesses this hypothesis and presents an extensive exploration of the opportunity to harness hybrid systems to process large-scale graphs efficiently. 
In particular, (i) we present a performance model that estimates the achievable performance on hybrid platforms; (ii) informed by the performance model, we design and develop TOTEM - a processing engine that provides a convenient environment to implement graph algorithms on hybrid platforms; (iii) we show that further performance gains can be extracted using partitioning strategies that aim to produce partitions that each matches the strengths of the processing element it is allocated to, finally, (iv) we demonstrate the performance advantages of the hybrid system through a comprehensive evaluation that uses real and synthetic workloads (as large as 16 billion edges), multiple graph algorithms that stress the system in various ways, and a variety of hardware configurations.</description><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz8tqwkAUgOHZuCjafVdlXiBxrslkWaKNhYCCdh3OXI4d0CgzUpq3b227-nc_fIQ8cVYqozVbQvqKnyWXXJSScfNAVmvE6GIYb7SHdAzF3sEp0C7B9YPu0sWFnON4pJeRbiaboqft7p3C6Gn30_2Ub-GcF2SGcMrh8b9zcnhdH9pN0W-7t_alL6DSpgAP3CmEwAUTzjPOGqmR1aAqVMxhbcHWRktpuEatfC1AeK4roWzTeLRyTp7_tr-K4ZriGdI03DXDXSO_AQC2Q2U</recordid><startdate>20131210</startdate><enddate>20131210</enddate><creator>Gharaibeh, Abdullah</creator><creator>Reza, Tahsin</creator><creator>Santos-Neto, Elizeu</creator><creator>Costa, Lauro Beltrao</creator><creator>Sallinen, Scott</creator><creator>Ripeanu, Matei</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20131210</creationdate><title>Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems</title><author>Gharaibeh, Abdullah ; Reza, Tahsin ; Santos-Neto, Elizeu ; Costa, Lauro Beltrao ; Sallinen, Scott ; Ripeanu, 
Matei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a658-ada1c4fae1202cd010935f07a46f40cf7bab78533815f54d72a2d15624b99dfb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><toplevel>online_resources</toplevel><creatorcontrib>Gharaibeh, Abdullah</creatorcontrib><creatorcontrib>Reza, Tahsin</creatorcontrib><creatorcontrib>Santos-Neto, Elizeu</creatorcontrib><creatorcontrib>Costa, Lauro Beltrao</creatorcontrib><creatorcontrib>Sallinen, Scott</creatorcontrib><creatorcontrib>Ripeanu, Matei</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gharaibeh, Abdullah</au><au>Reza, Tahsin</au><au>Santos-Neto, Elizeu</au><au>Costa, Lauro Beltrao</au><au>Sallinen, Scott</au><au>Ripeanu, Matei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems</atitle><date>2013-12-10</date><risdate>2013</risdate><abstract>The increasing scale and wealth of inter-connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, real-world graphs are famously difficult to process efficiently. Not only they have a large memory footprint, but also most graph algorithms entail memory access patterns with poor locality, data-dependent parallelism and a low compute-to-memory access ratio. Moreover, most real-world graphs have a highly heterogeneous node degree distribution, hence partitioning these graphs for parallel processing and simultaneously achieving access locality and load-balancing is difficult. 
This work starts from the hypothesis that hybrid platforms (e.g., GPU-accelerated systems) have both the potential to cope with the heterogeneous structure of real graphs and to offer a cost-effective platform for high-performance graph processing. This work assesses this hypothesis and presents an extensive exploration of the opportunity to harness hybrid systems to process large-scale graphs efficiently. In particular, (i) we present a performance model that estimates the achievable performance on hybrid platforms; (ii) informed by the performance model, we design and develop TOTEM - a processing engine that provides a convenient environment to implement graph algorithms on hybrid platforms; (iii) we show that further performance gains can be extracted using partitioning strategies that aim to produce partitions that each matches the strengths of the processing element it is allocated to, finally, (iv) we demonstrate the performance advantages of the hybrid system through a comprehensive evaluation that uses real and synthetic workloads (as large as 16 billion edges), multiple graph algorithms that stress the system in various ways, and a variety of hardware configurations.</abstract><doi>10.48550/arxiv.1312.3018</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.1312.3018
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_1312_3018
source	arXiv.org
subjects	Computer Science - Distributed, Parallel, and Cluster Computing
title	Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T03%3A44%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20Large-Scale%20Graph%20Processing%20on%20Hybrid%20CPU%20and%20GPU%20Systems&rft.au=Gharaibeh,%20Abdullah&rft.date=2013-12-10&rft_id=info:doi/10.48550/arxiv.1312.3018&rft_dat=%3Carxiv_GOX%3E1312_3018%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true