High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation

The utilization of graphics processing units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long-known technique by which a conjugate gradient (CG) type iterative solution scheme can be decomposed entirely into computations at the element level, i...


Bibliographic details
Main authors: Kiss, I., Badics, Z., Gyimothy, S., Pavo, J.
Format: Conference proceeding
Language: eng
container_end_page 5
container_issue
container_start_page 1
container_title
container_volume
creator Kiss, I.
Badics, Z.
Gyimothy, S.
Pavo, J.
description The utilization of graphics processing units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long-known technique by which a conjugate gradient (CG) type iterative solution scheme can be decomposed entirely into computations at the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing platform, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited to GPUs than techniques that require massive manipulation of large data sets. The study of the proposed parallel model shows greatly improved locality and reduced data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.
doi_str_mv 10.1109/HPEC.2012.6408659
format Conference Proceeding
fulltext fulltext_linktorsrc
identifier ISBN: 146731577X
ispartof 2012 IEEE Conference on High Performance Extreme Computing, 2012, p.1-5
issn
language eng
recordid cdi_ieee_primary_6408659
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Computational modeling
CUDA Computing
EbE FEM
Finite element methods
GPU Computing
Graphics processing units
Kernel
Matrix decomposition
parallel FEM
Sparse matrices
Vectors
title High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T07%3A06%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=High%20locality%20and%20increased%20intra-node%20parallelism%20for%20solving%20finite%20element%20models%20on%20GPUs%20by%20novel%20element-by-element%20implementation&rft.btitle=2012%20IEEE%20Conference%20on%20High%20Performance%20Extreme%20Computing&rft.au=Kiss,%20I.&rft.date=2012-09&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.isbn=146731577X&rft.isbn_list=9781467315777&rft_id=info:doi/10.1109/HPEC.2012.6408659&rft_dat=%3Cieee_6IE%3E6408659%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467315753&rft.eisbn_list=1467315745&rft.eisbn_list=1467315761&rft.eisbn_list=9781467315760&rft.eisbn_list=9781467315746&rft.eisbn_list=1467315753&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6408659&rfr_iscdi=true
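
The description above says that a CG-type solver can be decomposed into element-level computations without assembling the global system matrix. The following is a minimal sketch of that idea, not the authors' CUDA implementation: it is serial Python, and the 1D Poisson model problem, the linear elements, and all function names (`ebe_matvec`, `cg`, `solve_poisson_1d`) are assumptions chosen for illustration only.

```python
# Element-by-element (matrix-free) conjugate gradient, illustrative sketch.
# Assumed model problem: -u'' = 1 on (0, 1), u(0) = u(1) = 0, linear elements.

def ebe_matvec(x, n_elem, h):
    """Return y = K x without ever assembling or storing K."""
    y = [0.0] * len(x)
    k = 1.0 / h
    for e in range(n_elem):
        a, b = e, e + 1  # global node indices of element e
        # Element matrix (1/h) * [[1, -1], [-1, 1]] is applied on the fly.
        d = k * (x[a] - x[b])
        y[a] += d
        y[b] -= d
    # Homogeneous Dirichlet boundary: boundary rows take no part in the solve.
    y[0] = 0.0
    y[-1] = 0.0
    return y


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg(matvec, b, tol=1e-12, max_iter=1000):
    """Plain conjugate gradient that only needs a matvec callback."""
    x = [0.0] * len(b)
    r = b[:]          # residual for the zero initial guess
    p = r[:]
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_poisson_1d(n_elem):
    """Solve the assumed model problem and return the nodal values."""
    h = 1.0 / n_elem
    # Load vector for f = 1: each interior node receives h/2 from two elements.
    b = [0.0] + [h] * (n_elem - 1) + [0.0]
    return cg(lambda v: ebe_matvec(v, n_elem, h), b)


if __name__ == "__main__":
    for i, ui in enumerate(solve_poisson_1d(8)):
        print(i, ui)
```

Each pass of the loop in `ebe_matvec` touches only the data of one element, so the per-element work is independent. A parallel version would presumably assign elements to threads and would need some way to handle neighbouring elements writing to a shared node; how the paper resolves that is not stated in this record.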