A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays
The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers...
Published in: | arXiv.org 2021-11 |
---|---|
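Main authors: | Karp, Martin; Podobas, Artur; Kenter, Tobias; Jansson, Niclas; Plessl, Christian; Schlatter, Philipp; Markidis, Stefano |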
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Karp, Martin; Podobas, Artur; Kenter, Tobias; Jansson, Niclas; Plessl, Christian; Schlatter, Philipp; Markidis, Stefano |
description | The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. Unlike prior work -- which often focuses on accelerating small kernels -- we target the entire Poisson solver on unstructured meshes based on the high-fidelity spectral element method (SEM) used in modern state-of-the-art CFD systems. We model our accelerator using an analytical performance model based on the I/O cost of the algorithm. We empirically evaluate our accelerator on a state-of-the-art Intel Stratix 10 FPGA in terms of performance and power consumption and contrast it against existing solutions on general-purpose processors (CPUs). Finally, we propose a data movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ Gflop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator. We end the paper by discussing the challenges and opportunities of using reconfigurable architecture in the future, particularly in the light of emerging (not yet available) technologies. |
doi_str_mv | 10.48550/arxiv.2108.12188 |
format | Article |
fullrecord | sourceid: proquest_arxiv; recordid: TN_cdi_arxiv_primary_2108_12188; pqid: 2567811585; type: article; publisher: Ithaca: Cornell University Library, arXiv.org; date: 2021-11-02; rights: 2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.; oa: free_for_read; backlink: https://doi.org/10.48550/arXiv.2108.12188 (View paper in arXiv); backlink: https://doi.org/10.1145/3492805.3492808 (View published paper; access to full text may be restricted) |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2108_12188 |
source | arXiv.org; Free E- Journals |
subjects | Accuracy; Algorithms; Computational fluid dynamics; Computer Science - Distributed, Parallel, and Cluster Computing; Cost analysis; Field programmable gate arrays; Mathematical models; Moore's law; Power consumption; Solvers; Spectral element method |
title | A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T12%3A56%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20High-Fidelity%20Flow%20Solver%20for%20Unstructured%20Meshes%20on%20Field-Programmable%20Gate%20Arrays&rft.jtitle=arXiv.org&rft.au=Karp,%20Martin&rft.date=2021-11-02&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2108.12188&rft_dat=%3Cproquest_arxiv%3E2567811585%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2567811585&rft_id=info:pmid/&rfr_iscdi=true |
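
The description in this record mentions an analytical performance model based on the I/O cost of the algorithm, and a technique that computes geometric factors on the fly to reduce data movement. The sketch below illustrates that style of reasoning only; it is not the authors' model. It assumes a flop count of 12n + 15 per grid point for n points per direction, six stored geometric factors per point plus one input and one output value, and a hypothetical memory bandwidth. It also neglects the per-element geometry that an on-the-fly variant would still have to read. None of these numbers come from the record.

```python
# Illustrative I/O-bound estimate for the local Laplace operator evaluation
# in a spectral element method. NOT the model from the paper: the flop count,
# the values moved per point, and the bandwidth are all assumptions.

def flops_per_point(n):
    """Assumed flop count per grid point for n points per direction."""
    return 12 * n + 15

def io_bound_gflops(n, values_per_point, bytes_per_value, bandwidth_gb_s):
    """Upper bound on Gflop/s when memory traffic is the only limit."""
    bytes_per_point = values_per_point * bytes_per_value
    intensity = flops_per_point(n) / bytes_per_point  # flop per byte
    return intensity * bandwidth_gb_s

N_POINTS = 8        # hypothetical points per direction in each element
BANDWIDTH = 64.0    # hypothetical memory bandwidth in GB/s
SINGLE = 4          # bytes per single-precision value

# Stored factors: 1 input + 1 output + 6 geometric factors per point (assumed).
stored = io_bound_gflops(N_POINTS, 8, SINGLE, BANDWIDTH)
# On-the-fly factors: only input and output are streamed (assumed).
on_the_fly = io_bound_gflops(N_POINTS, 2, SINGLE, BANDWIDTH)

print(stored, on_the_fly, on_the_fly / stored)
```

Under these assumptions the bound rises by the same factor that the traffic per point falls. A real design would also be limited by compute resources and by the cost of recomputing the factors, so the ratio here should be read as an idealized ceiling rather than a prediction of the runtime reduction reported in the description.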