High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation
The use of Graphics Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long-known technique in which a conjugate gradient (CG) type iterative solution scheme is decomposed entirely into element-level computations, i...
Main authors: | Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 5 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J. |
description | The use of Graphics Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long-known technique in which a conjugate gradient (CG) type iterative solution scheme is decomposed entirely into element-level computations, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing platform, the Compute Unified Device Architecture (CUDA), performs the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited to GPUs than techniques that require massive manipulation of large data sets. The study of the proposed parallel model shows greatly improved locality and minimized data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures. |
doi_str_mv | 10.1109/HPEC.2012.6408659 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6408659</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6408659</ieee_id><sourcerecordid>6408659</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</originalsourceid><addsrcrecordid>eNo1kMFKAzEQhiMiqLUPIF7yAluTzSbZHKXUVijYgwVvZbI7qZFsUjZLYZ_A17ZSe5rv_-ef_zCEPHI245yZ59VmMZ-VjJczVbFaSXNFpkbXvFJacKmluCb3F6E_b8k052_G2OlWlZLdkZ-V33_RkBoIfhgpxJb62PQIGf9o6KGIqUV6gB5CwOBzR13qaU7h6OOeOh_9gBQDdhgH2p2yIdMU6XKzzdSONKYjhsu-sGNxifrucCYYfIoP5MZByDj9nxOyfV18zFfF-n35Nn9ZF01Zm6FQwIwSzgqODLUSJ1O3ElFKA5Yr5UDaRrqyRc2NrBkX2oJuG1dVYDWimJCnc69HxN2h9x304-7_d-IXR7JlyQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J.</creator><creatorcontrib>Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J.</creatorcontrib><description>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. 
It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</description><identifier>ISBN: 146731577X</identifier><identifier>ISBN: 9781467315777</identifier><identifier>EISBN: 9781467315753</identifier><identifier>EISBN: 1467315745</identifier><identifier>EISBN: 1467315761</identifier><identifier>EISBN: 9781467315760</identifier><identifier>EISBN: 9781467315746</identifier><identifier>EISBN: 1467315753</identifier><identifier>DOI: 10.1109/HPEC.2012.6408659</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computational modeling ; CUDA Computing ; EbE FEM ; Finite element methods ; GPU Computing ; Graphics processing units ; Kernel ; Matrix decomposition ; parallel FEM ; Sparse matrices ; Vectors</subject><ispartof>2012 IEEE Conference on High Performance Extreme Computing, 2012, p.1-5</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6408659$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2056,27916,54911</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6408659$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kiss, I.</creatorcontrib><creatorcontrib>Badics, Z.</creatorcontrib><creatorcontrib>Gyimothy, S.</creatorcontrib><creatorcontrib>Pavo, J.</creatorcontrib><title>High locality and increased intra-node parallelism 
for solving finite element models on GPUs by novel element-by-element implementation</title><title>2012 IEEE Conference on High Performance Extreme Computing</title><addtitle>HPEC</addtitle><description>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</description><subject>Computational modeling</subject><subject>CUDA Computing</subject><subject>EbE FEM</subject><subject>Finite element methods</subject><subject>GPU Computing</subject><subject>Graphics processing units</subject><subject>Kernel</subject><subject>Matrix decomposition</subject><subject>parallel FEM</subject><subject>Sparse 
matrices</subject><subject>Vectors</subject><isbn>146731577X</isbn><isbn>9781467315777</isbn><isbn>9781467315753</isbn><isbn>1467315745</isbn><isbn>1467315761</isbn><isbn>9781467315760</isbn><isbn>9781467315746</isbn><isbn>1467315753</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1kMFKAzEQhiMiqLUPIF7yAluTzSbZHKXUVijYgwVvZbI7qZFsUjZLYZ_A17ZSe5rv_-ef_zCEPHI245yZ59VmMZ-VjJczVbFaSXNFpkbXvFJacKmluCb3F6E_b8k052_G2OlWlZLdkZ-V33_RkBoIfhgpxJb62PQIGf9o6KGIqUV6gB5CwOBzR13qaU7h6OOeOh_9gBQDdhgH2p2yIdMU6XKzzdSONKYjhsu-sGNxifrucCYYfIoP5MZByDj9nxOyfV18zFfF-n35Nn9ZF01Zm6FQwIwSzgqODLUSJ1O3ElFKA5Yr5UDaRrqyRc2NrBkX2oJuG1dVYDWimJCnc69HxN2h9x304-7_d-IXR7JlyQ</recordid><startdate>201209</startdate><enddate>201209</enddate><creator>Kiss, I.</creator><creator>Badics, Z.</creator><creator>Gyimothy, S.</creator><creator>Pavo, J.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201209</creationdate><title>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</title><author>Kiss, I. ; Badics, Z. ; Gyimothy, S. 
; Pavo, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Computational modeling</topic><topic>CUDA Computing</topic><topic>EbE FEM</topic><topic>Finite element methods</topic><topic>GPU Computing</topic><topic>Graphics processing units</topic><topic>Kernel</topic><topic>Matrix decomposition</topic><topic>parallel FEM</topic><topic>Sparse matrices</topic><topic>Vectors</topic><toplevel>online_resources</toplevel><creatorcontrib>Kiss, I.</creatorcontrib><creatorcontrib>Badics, Z.</creatorcontrib><creatorcontrib>Gyimothy, S.</creatorcontrib><creatorcontrib>Pavo, J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kiss, I.</au><au>Badics, Z.</au><au>Gyimothy, S.</au><au>Pavo, J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</atitle><btitle>2012 IEEE Conference on High Performance Extreme 
Computing</btitle><stitle>HPEC</stitle><date>2012-09</date><risdate>2012</risdate><spage>1</spage><epage>5</epage><pages>1-5</pages><isbn>146731577X</isbn><isbn>9781467315777</isbn><eisbn>9781467315753</eisbn><eisbn>1467315745</eisbn><eisbn>1467315761</eisbn><eisbn>9781467315760</eisbn><eisbn>9781467315746</eisbn><eisbn>1467315753</eisbn><abstract>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</abstract><pub>IEEE</pub><doi>10.1109/HPEC.2012.6408659</doi><tpages>5</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 146731577X |
ispartof | 2012 IEEE Conference on High Performance Extreme Computing, 2012, p.1-5 |
issn | |
language | eng |
recordid | cdi_ieee_primary_6408659 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Computational modeling ; CUDA Computing ; EbE FEM ; Finite element methods ; GPU Computing ; Graphics processing units ; Kernel ; Matrix decomposition ; parallel FEM ; Sparse matrices ; Vectors |
title | High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T07%3A06%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=High%20locality%20and%20increased%20intra-node%20parallelism%20for%20solving%20finite%20element%20models%20on%20GPUs%20by%20novel%20element-by-element%20implementation&rft.btitle=2012%20IEEE%20Conference%20on%20High%20Performance%20Extreme%20Computing&rft.au=Kiss,%20I.&rft.date=2012-09&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.isbn=146731577X&rft.isbn_list=9781467315777&rft_id=info:doi/10.1109/HPEC.2012.6408659&rft_dat=%3Cieee_6IE%3E6408659%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467315753&rft.eisbn_list=1467315745&rft.eisbn_list=1467315761&rft.eisbn_list=9781467315760&rft.eisbn_list=9781467315746&rft.eisbn_list=1467315753&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6408659&rfr_iscdi=true |
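
The abstract's central idea, running a conjugate gradient iteration without ever assembling the global system matrix, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a 1D Poisson problem -u'' = f on [0, 1] with linear elements and zero Dirichlet values at both ends, plain serial Python instead of CUDA, and the hypothetical function names `ebe_matvec`, `dot`, and `ebe_cg`. The point it shows is that the matrix-vector product inside CG only needs per-element contributions, which are recomputed on the fly rather than stored.

```python
def ebe_matvec(x, n_elems, h):
    """Return y = K x computed element by element; the global K is never formed.

    Illustrative assumption: 1D linear elements for -u'' = f, whose 2x2
    element matrix is (1/h) * [[1, -1], [-1, 1]]. It is recomputed for
    every element instead of being stored.
    """
    y = [0.0] * (n_elems + 1)
    k = 1.0 / h
    for e in range(n_elems):
        # In a GPU version each element could map to one thread; the
        # scatter-add below would then need atomics or element coloring
        # to avoid write conflicts (not shown in this serial sketch).
        a, b = e, e + 1
        xa, xb = x[a], x[b]
        y[a] += k * (xa - xb)
        y[b] += k * (xb - xa)
    # Dirichlet nodes are held at zero, so their rows are dropped.
    y[0] = 0.0
    y[-1] = 0.0
    return y


def dot(u, v):
    """Plain inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def ebe_cg(f, n_elems, h, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG using only the element-by-element product."""
    x = [0.0] * (n_elems + 1)
    r = list(f)
    r[0] = r[-1] = 0.0          # residual lives on interior nodes only
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = ebe_matvec(p, n_elems, h)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

A call such as `ebe_cg([h] * (n_elems + 1), n_elems, h)` with `h = 1.0 / n_elems` solves the constant-load case f = 1. The only storage is a handful of nodal vectors, which is the low-memory, compute-heavy trade-off the abstract describes; how the paper itself organizes threads and resolves concurrent updates is not stated in this record.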