A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays

The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers...

Detailed description

Bibliographic details
Published in: arXiv.org 2021-11
Main authors: Karp, Martin; Podobas, Artur; Kenter, Tobias; Jansson, Niclas; Plessl, Christian; Schlatter, Philipp; Markidis, Stefano
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Karp, Martin
Podobas, Artur
Kenter, Tobias
Jansson, Niclas
Plessl, Christian
Schlatter, Philipp
Markidis, Stefano
description The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. Unlike prior work -- which often focuses on accelerating small kernels -- we target the entire Poisson solver on unstructured meshes based on the high-fidelity spectral element method (SEM) used in modern state-of-the-art CFD systems. We model our accelerator using an analytical performance model based on the I/O cost of the algorithm. We empirically evaluate our accelerator on a state-of-the-art Intel Stratix 10 FPGA in terms of performance and power consumption and contrast it against existing solutions on general-purpose processors (CPUs). Finally, we propose a data movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ Gflop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator. We end the paper by discussing the challenges and opportunities of using reconfigurable architecture in the future, particularly in the light of emerging (not yet available) technologies.
doi_str_mv 10.48550/arxiv.2108.12188
format Article
fullrecord <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2108_12188</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2567811585</sourcerecordid><originalsourceid>FETCH-LOGICAL-a525-b0d12e44d177836dde50dbb081ea909fd552a84595a951787e90093b46504d6f3</originalsourceid><addsrcrecordid>eNotj81OwkAURicmJhLkAVw5ievi_N12uiTEgglGE2HdTJ1bKCkM3mlR3l4UV9_m5Ms5jN1JMTYWQDw6-m6OYyWFHUslrb1iA6W1TKxR6oaNYtwKIVSaKQA9YKsJnzfrTVI0HtumO_GiDV_8PbRHJF4H4qt97Kj_6HpCz18wbjDysOdFg61P3iisye12rmqRz1yHfELkTvGWXdeujTj63yFbFk_L6TxZvM6ep5NF4kBBUgkvFRrjZZZZnXqPIHxVCSvR5SKvPYBy1kAOLgeZ2QxzIXJdmRSE8Wmth-z-cvvXXB6o2Tk6lb_t5V_7mXi4EAcKnz3GrtyGnvZnp1JBmlkpwYL-AZKSWwc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2567811585</pqid></control><display><type>article</type><title>A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Karp, Martin ; Podobas, Artur ; Kenter, Tobias ; Jansson, Niclas ; Plessl, Christian ; Schlatter, Philipp ; Markidis, Stefano</creator><creatorcontrib>Karp, Martin ; Podobas, Artur ; Kenter, Tobias ; Jansson, Niclas ; Plessl, Christian ; Schlatter, Philipp ; Markidis, Stefano</creatorcontrib><description>The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. 
Unlike prior work -- which often focuses on accelerating small kernels -- we target the entire Poisson solver on unstructured meshes based on the high-fidelity spectral element method (SEM) used in modern state-of-the-art CFD systems. We model our accelerator using an analytical performance model based on the I/O cost of the algorithm. We empirically evaluate our accelerator on a state-of-the-art Intel Stratix 10 FPGA in terms of performance and power consumption and contrast it against existing solutions on general-purpose processors (CPUs). Finally, we propose a data movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ Gflop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator. We end the paper by discussing the challenges and opportunities of using reconfigurable architecture in the future, particularly in the light of emerging (not yet available) technologies.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2108.12188</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Algorithms ; Computational fluid dynamics ; Computer Science - Distributed, Parallel, and Cluster Computing ; Cost analysis ; Field programmable gate arrays ; Mathematical models ; Moore's law ; Power consumption ; Solvers ; Spectral element method</subject><ispartof>arXiv.org, 2021-11</ispartof><rights>2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27904</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2108.12188$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1145/3492805.3492808$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Karp, Martin</creatorcontrib><creatorcontrib>Podobas, Artur</creatorcontrib><creatorcontrib>Kenter, Tobias</creatorcontrib><creatorcontrib>Jansson, Niclas</creatorcontrib><creatorcontrib>Plessl, Christian</creatorcontrib><creatorcontrib>Schlatter, Philipp</creatorcontrib><creatorcontrib>Markidis, Stefano</creatorcontrib><title>A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays</title><title>arXiv.org</title><description>The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. Unlike prior work -- which often focuses on accelerating small kernels -- we target the entire Poisson solver on unstructured meshes based on the high-fidelity spectral element method (SEM) used in modern state-of-the-art CFD systems. 
We model our accelerator using an analytical performance model based on the I/O cost of the algorithm. We empirically evaluate our accelerator on a state-of-the-art Intel Stratix 10 FPGA in terms of performance and power consumption and contrast it against existing solutions on general-purpose processors (CPUs). Finally, we propose a data movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ Gflop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator. We end the paper by discussing the challenges and opportunities of using reconfigurable architecture in the future, particularly in the light of emerging (not yet available) technologies.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Computational fluid dynamics</subject><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><subject>Cost analysis</subject><subject>Field programmable gate arrays</subject><subject>Mathematical models</subject><subject>Moore's law</subject><subject>Power consumption</subject><subject>Solvers</subject><subject>Spectral element method</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><sourceid>GOX</sourceid><recordid>eNotj81OwkAURicmJhLkAVw5ievi_N12uiTEgglGE2HdTJ1bKCkM3mlR3l4UV9_m5Ms5jN1JMTYWQDw6-m6OYyWFHUslrb1iA6W1TKxR6oaNYtwKIVSaKQA9YKsJnzfrTVI0HtumO_GiDV_8PbRHJF4H4qt97Kj_6HpCz18wbjDysOdFg61P3iisye12rmqRz1yHfELkTvGWXdeujTj63yFbFk_L6TxZvM6ep5NF4kBBUgkvFRrjZZZZnXqPIHxVCSvR5SKvPYBy1kAOLgeZ2QxzIXJdmRSE8Wmth-z-cvvXXB6o2Tk6lb_t5V_7mXi4EAcKnz3GrtyGnvZnp1JBmlkpwYL-AZKSWwc</recordid><startdate>20211102</startdate><enddate>20211102</enddate><creator>Karp, Martin</creator><creator>Podobas, Artur</creator><creator>Kenter, Tobias</creator><creator>Jansson, Niclas</creator><creator>Plessl, 
Christian</creator><creator>Schlatter, Philipp</creator><creator>Markidis, Stefano</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20211102</creationdate><title>A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays</title><author>Karp, Martin ; Podobas, Artur ; Kenter, Tobias ; Jansson, Niclas ; Plessl, Christian ; Schlatter, Philipp ; Markidis, Stefano</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a525-b0d12e44d177836dde50dbb081ea909fd552a84595a951787e90093b46504d6f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Computational fluid dynamics</topic><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><topic>Cost analysis</topic><topic>Field programmable gate arrays</topic><topic>Mathematical models</topic><topic>Moore's law</topic><topic>Power consumption</topic><topic>Solvers</topic><topic>Spectral element method</topic><toplevel>online_resources</toplevel><creatorcontrib>Karp, Martin</creatorcontrib><creatorcontrib>Podobas, Artur</creatorcontrib><creatorcontrib>Kenter, Tobias</creatorcontrib><creatorcontrib>Jansson, Niclas</creatorcontrib><creatorcontrib>Plessl, Christian</creatorcontrib><creatorcontrib>Schlatter, Philipp</creatorcontrib><creatorcontrib>Markidis, Stefano</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology 
Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Karp, Martin</au><au>Podobas, Artur</au><au>Kenter, Tobias</au><au>Jansson, Niclas</au><au>Plessl, Christian</au><au>Schlatter, Philipp</au><au>Markidis, Stefano</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays</atitle><jtitle>arXiv.org</jtitle><date>2021-11-02</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. 
In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. Unlike prior work -- which often focuses on accelerating small kernels -- we target the entire Poisson solver on unstructured meshes based on the high-fidelity spectral element method (SEM) used in modern state-of-the-art CFD systems. We model our accelerator using an analytical performance model based on the I/O cost of the algorithm. We empirically evaluate our accelerator on a state-of-the-art Intel Stratix 10 FPGA in terms of performance and power consumption and contrast it against existing solutions on general-purpose processors (CPUs). Finally, we propose a data movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ Gflop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator. We end the paper by discussing the challenges and opportunities of using reconfigurable architecture in the future, particularly in the light of emerging (not yet available) technologies.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2108.12188</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2021-11
issn 2331-8422
language eng
recordid cdi_arxiv_primary_2108_12188
source arXiv.org; Free E- Journals
subjects Accuracy
Algorithms
Computational fluid dynamics
Computer Science - Distributed, Parallel, and Cluster Computing
Cost analysis
Field programmable gate arrays
Mathematical models
Moore's law
Power consumption
Solvers
Spectral element method
title A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T12%3A56%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20High-Fidelity%20Flow%20Solver%20for%20Unstructured%20Meshes%20on%20Field-Programmable%20Gate%20Arrays&rft.jtitle=arXiv.org&rft.au=Karp,%20Martin&rft.date=2021-11-02&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2108.12188&rft_dat=%3Cproquest_arxiv%3E2567811585%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2567811585&rft_id=info:pmid/&rfr_iscdi=true