Porting hypre to heterogeneous computer architectures: Strategies and experiences

Linear systems are occurring in many applications, and solving them can take a large amount of the total simulation time. The high performance library hypre provides a variety of interfaces and linear solvers, including various multigrid methods, that have achieved good scalability on a variety of h...

Bibliographic Details
Published in:	Parallel computing 2021-12, Vol.108 (C), p.102840, Article 102840
Main authors:	Falgout, Robert D.; Li, Ruipeng; Sjögreen, Björn; Wang, Lu; Yang, Ulrike Meier
Format:	Article
Language:	eng
Subjects:	GPU computing; HPC; Iterative methods; Multigrid methods; Preconditioning techniques
Online access:	Full text

container_end_page
container_issue	C
container_start_page	102840
container_title	Parallel computing
container_volume	108
creator	Falgout, Robert D.; Li, Ruipeng; Sjögreen, Björn; Wang, Lu; Yang, Ulrike Meier
description	Linear systems are occurring in many applications, and solving them can take a large amount of the total simulation time. The high performance library hypre provides a variety of interfaces and linear solvers, including various multigrid methods, that have achieved good scalability on a variety of homogeneous parallel computer architectures. Heterogeneous architectures with nodes that have both CPUs and accelerators provide new challenges, since they require more fine-grained parallelism and reduced data movement between different memories on a single node as well as across nodes. We will discuss our experiences and strategies to port hypre to heterogeneous computers with accelerators, including the design of a new memory model, the use of abstractions, the BoxLoop macros in the structured and semi-structured interfaces, and the restructuring of algebraic multigrid (AMG) into modular components. We present numerical experiments comparing CPU and GPU performance for several test problems. • Software effort on porting the hypre library to accelerator-based architectures. • Models of memory management and execution policy on heterogeneous platforms. • Portability strategy for structured interface, loop abstractions, and solvers. • Strategy to enable the unstructured interface, preconditioners, and solvers on GPUs. • Performance study of comparing various multigrid solvers running on CPUs and GPUs.
doi_str_mv	10.1016/j.parco.2021.102840
format	Article
fullrecord	<record><control><sourceid>elsevier_osti_</sourceid><recordid>TN_cdi_osti_scitechconnect_1823481</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167819121000867</els_id><sourcerecordid>S0167819121000867</sourcerecordid><originalsourceid>FETCH-LOGICAL-c375t-c058b9b2095b2be229ca1cadd873f9bede45e8f224c7d8a5eb7180d6745a73653</originalsourceid><addsrcrecordid>eNp9kEtLxDAUhYMoOI7-AjfBfcc82iYVXMjgCwQVdR3S5HYmg9OUJCPOvze1rl1dOJxzOPdD6JySBSW0vtwsBh2MXzDCaFaYLMkBmlEpWCE4rw_RLLtEIWlDj9FJjBtCSF1KMkOvLz4k16_wej8EwMnjNSQIfgU9-F3Exm-HXRZw7l-7BCbtAsQr_JaCTrByELHuLYbvAYKD3kA8RUed_oxw9nfn6OPu9n35UDw93z8ub54Kw0WVCkMq2TYtI03VshYYa4ymRlsrBe-aFiyUFciOsdIIK3UFraCS2FqUlRa8rvgcXUy9PianohnHrY3v-7xRUcl4KWk28clkgo8xQKeG4LY67BUlakSnNuoXnRrRqQldTl1PKcj7vxyEsX58zrowtlvv_s3_AEtuee4</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Porting hypre to heterogeneous computer architectures: Strategies and experiences</title><source>Access via ScienceDirect (Elsevier)</source><creator>Falgout, Robert D. ; Li, Ruipeng ; Sjögreen, Björn ; Wang, Lu ; Yang, Ulrike Meier</creator><creatorcontrib>Falgout, Robert D. ; Li, Ruipeng ; Sjögreen, Björn ; Wang, Lu ; Yang, Ulrike Meier</creatorcontrib><description>Linear systems are occurring in many applications, and solving them can take a large amount of the total simulation time. The high performance library hypre provides a variety of interfaces and linear solvers, including various multigrid methods, that have achieved good scalability on a variety of homogeneous parallel computer architectures. Heterogeneous architectures with nodes that have both CPUs and accelerators provide new challenges, since they require more fine-grained parallelism and reduced data movement between different memories on a single node as well as across nodes. 
We will discuss our experiences and strategies to port hypre to heterogeneous computers with accelerators, including the design of a new memory model, the use of abstractions, the BoxLoop macros in the structured and semi-structured interfaces, and the restructuring of algebraic multigrid (AMG) into modular components. We present numerical experiments comparing CPU and GPU performance for several test problems. •Software effort on porting the hypre library to accelerator-based architectures.•Models of memory management and execution policy on heterogeneous platforms.•Portability strategy for structured interface, loop abstractions, and solvers.•Strategy to enable the unstructured interface, preconditioners, and solvers on GPUs.•Performance study of comparing various multigrid solvers running on CPUs and GPUs.</description><identifier>ISSN: 0167-8191</identifier><identifier>EISSN: 1872-7336</identifier><identifier>DOI: 10.1016/j.parco.2021.102840</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>GPU computing ; HPC ; Iterative methods ; Multigrid methods ; Preconditioning techniques</subject><ispartof>Parallel computing, 2021-12, Vol.108 (C), p.102840, Article 102840</ispartof><rights>2021 Elsevier B.V.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c375t-c058b9b2095b2be229ca1cadd873f9bede45e8f224c7d8a5eb7180d6745a73653</citedby><cites>FETCH-LOGICAL-c375t-c058b9b2095b2be229ca1cadd873f9bede45e8f224c7d8a5eb7180d6745a73653</cites><orcidid>0000-0002-0927-4329 ; 0000-0003-4884-0087 ; 0000-0002-6957-0445 ; 0000-0003-2802-5763 ; 0000000348840087 ; 0000000269570445 ; 0000000209274329 ; 
0000000328025763</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.parco.2021.102840$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,780,784,885,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.osti.gov/biblio/1823481$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Falgout, Robert D.</creatorcontrib><creatorcontrib>Li, Ruipeng</creatorcontrib><creatorcontrib>Sjögreen, Björn</creatorcontrib><creatorcontrib>Wang, Lu</creatorcontrib><creatorcontrib>Yang, Ulrike Meier</creatorcontrib><title>Porting hypre to heterogeneous computer architectures: Strategies and experiences</title><title>Parallel computing</title><description>Linear systems are occurring in many applications, and solving them can take a large amount of the total simulation time. The high performance library hypre provides a variety of interfaces and linear solvers, including various multigrid methods, that have achieved good scalability on a variety of homogeneous parallel computer architectures. Heterogeneous architectures with nodes that have both CPUs and accelerators provide new challenges, since they require more fine-grained parallelism and reduced data movement between different memories on a single node as well as across nodes. We will discuss our experiences and strategies to port hypre to heterogeneous computers with accelerators, including the design of a new memory model, the use of abstractions, the BoxLoop macros in the structured and semi-structured interfaces, and the restructuring of algebraic multigrid (AMG) into modular components. We present numerical experiments comparing CPU and GPU performance for several test problems. 
•Software effort on porting the hypre library to accelerator-based architectures.•Models of memory management and execution policy on heterogeneous platforms.•Portability strategy for structured interface, loop abstractions, and solvers.•Strategy to enable the unstructured interface, preconditioners, and solvers on GPUs.•Performance study of comparing various multigrid solvers running on CPUs and GPUs.</description><subject>GPU computing</subject><subject>HPC</subject><subject>Iterative methods</subject><subject>Multigrid methods</subject><subject>Preconditioning techniques</subject><issn>0167-8191</issn><issn>1872-7336</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLxDAUhYMoOI7-AjfBfcc82iYVXMjgCwQVdR3S5HYmg9OUJCPOvze1rl1dOJxzOPdD6JySBSW0vtwsBh2MXzDCaFaYLMkBmlEpWCE4rw_RLLtEIWlDj9FJjBtCSF1KMkOvLz4k16_wej8EwMnjNSQIfgU9-F3Exm-HXRZw7l-7BCbtAsQr_JaCTrByELHuLYbvAYKD3kA8RUed_oxw9nfn6OPu9n35UDw93z8ub54Kw0WVCkMq2TYtI03VshYYa4ymRlsrBe-aFiyUFciOsdIIK3UFraCS2FqUlRa8rvgcXUy9PianohnHrY3v-7xRUcl4KWk28clkgo8xQKeG4LY67BUlakSnNuoXnRrRqQldTl1PKcj7vxyEsX58zrowtlvv_s3_AEtuee4</recordid><startdate>202112</startdate><enddate>202112</enddate><creator>Falgout, Robert D.</creator><creator>Li, Ruipeng</creator><creator>Sjögreen, Björn</creator><creator>Wang, Lu</creator><creator>Yang, Ulrike Meier</creator><general>Elsevier 
B.V</general><general>Elsevier</general><scope>AAYXX</scope><scope>CITATION</scope><scope>OTOTI</scope><orcidid>https://orcid.org/0000-0002-0927-4329</orcidid><orcidid>https://orcid.org/0000-0003-4884-0087</orcidid><orcidid>https://orcid.org/0000-0002-6957-0445</orcidid><orcidid>https://orcid.org/0000-0003-2802-5763</orcidid><orcidid>https://orcid.org/0000000348840087</orcidid><orcidid>https://orcid.org/0000000269570445</orcidid><orcidid>https://orcid.org/0000000209274329</orcidid><orcidid>https://orcid.org/0000000328025763</orcidid></search><sort><creationdate>202112</creationdate><title>Porting hypre to heterogeneous computer architectures: Strategies and experiences</title><author>Falgout, Robert D. ; Li, Ruipeng ; Sjögreen, Björn ; Wang, Lu ; Yang, Ulrike Meier</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c375t-c058b9b2095b2be229ca1cadd873f9bede45e8f224c7d8a5eb7180d6745a73653</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>GPU computing</topic><topic>HPC</topic><topic>Iterative methods</topic><topic>Multigrid methods</topic><topic>Preconditioning techniques</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Falgout, Robert D.</creatorcontrib><creatorcontrib>Li, Ruipeng</creatorcontrib><creatorcontrib>Sjögreen, Björn</creatorcontrib><creatorcontrib>Wang, Lu</creatorcontrib><creatorcontrib>Yang, Ulrike Meier</creatorcontrib><collection>CrossRef</collection><collection>OSTI.GOV</collection><jtitle>Parallel computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Falgout, Robert D.</au><au>Li, Ruipeng</au><au>Sjögreen, Björn</au><au>Wang, Lu</au><au>Yang, Ulrike Meier</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Porting hypre to heterogeneous computer architectures: Strategies and 
experiences</atitle><jtitle>Parallel computing</jtitle><date>2021-12</date><risdate>2021</risdate><volume>108</volume><issue>C</issue><spage>102840</spage><pages>102840-</pages><artnum>102840</artnum><issn>0167-8191</issn><eissn>1872-7336</eissn><abstract>Linear systems are occurring in many applications, and solving them can take a large amount of the total simulation time. The high performance library hypre provides a variety of interfaces and linear solvers, including various multigrid methods, that have achieved good scalability on a variety of homogeneous parallel computer architectures. Heterogeneous architectures with nodes that have both CPUs and accelerators provide new challenges, since they require more fine-grained parallelism and reduced data movement between different memories on a single node as well as across nodes. We will discuss our experiences and strategies to port hypre to heterogeneous computers with accelerators, including the design of a new memory model, the use of abstractions, the BoxLoop macros in the structured and semi-structured interfaces, and the restructuring of algebraic multigrid (AMG) into modular components. We present numerical experiments comparing CPU and GPU performance for several test problems. 
•Software effort on porting the hypre library to accelerator-based architectures.•Models of memory management and execution policy on heterogeneous platforms.•Portability strategy for structured interface, loop abstractions, and solvers.•Strategy to enable the unstructured interface, preconditioners, and solvers on GPUs.•Performance study of comparing various multigrid solvers running on CPUs and GPUs.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><doi>10.1016/j.parco.2021.102840</doi><orcidid>https://orcid.org/0000-0002-0927-4329</orcidid><orcidid>https://orcid.org/0000-0003-4884-0087</orcidid><orcidid>https://orcid.org/0000-0002-6957-0445</orcidid><orcidid>https://orcid.org/0000-0003-2802-5763</orcidid><orcidid>https://orcid.org/0000000348840087</orcidid><orcidid>https://orcid.org/0000000269570445</orcidid><orcidid>https://orcid.org/0000000209274329</orcidid><orcidid>https://orcid.org/0000000328025763</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-8191
ispartof	Parallel computing, 2021-12, Vol.108 (C), p.102840, Article 102840
issn	0167-8191 1872-7336
language	eng
recordid	cdi_osti_scitechconnect_1823481
source	Access via ScienceDirect (Elsevier)
subjects	GPU computing; HPC; Iterative methods; Multigrid methods; Preconditioning techniques
title	Porting hypre to heterogeneous computer architectures: Strategies and experiences
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T20%3A03%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_osti_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Porting%20hypre%20to%20heterogeneous%20computer%20architectures:%20Strategies%20and%20experiences&rft.jtitle=Parallel%20computing&rft.au=Falgout,%20Robert%20D.&rft.date=2021-12&rft.volume=108&rft.issue=C&rft.spage=102840&rft.pages=102840-&rft.artnum=102840&rft.issn=0167-8191&rft.eissn=1872-7336&rft_id=info:doi/10.1016/j.parco.2021.102840&rft_dat=%3Celsevier_osti_%3ES0167819121000867%3C/elsevier_osti_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_els_id=S0167819121000867&rfr_iscdi=true