High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation

The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i...
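
The element-by-element idea summarized above can be illustrated with a minimal sketch. It is plain CPU Python, not the authors' CUDA implementation, and it assumes a hypothetical 1D model problem (-u'' = 1 on (0, 1) with zero boundary values, linear two-node elements). The function names `ebe_matvec`, `cg`, and `solve_poisson_ebe` are made up for this illustration. The point it shows is that the conjugate gradient iteration only needs the product K·x, which can be accumulated from element contributions without ever assembling or storing K.

```python
def ebe_matvec(x, n_elems, h):
    """Return y = K x, accumulated element by element (K is never stored)."""
    y = [0.0] * (n_elems + 1)
    for e in range(n_elems):
        a, b = e, e + 1
        # The local stiffness (1/h) * [[1, -1], [-1, 1]] is recomputed on the fly.
        d = (x[a] - x[b]) / h
        y[a] += d
        y[b] -= d
    # Homogeneous Dirichlet conditions: boundary unknowns are masked out.
    y[0] = 0.0
    y[-1] = 0.0
    return y


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg(matvec, b, tol=1e-12, max_iter=1000):
    """Textbook conjugate gradient that touches the operator only via matvec."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def solve_poisson_ebe(n_elems):
    """Solve the assumed model problem on n_elems equal elements."""
    h = 1.0 / n_elems
    # Load vector for f = 1: each element contributes h/2 to both of its nodes.
    b = [h] * (n_elems + 1)
    b[0] = 0.0
    b[-1] = 0.0
    return cg(lambda v: ebe_matvec(v, n_elems, h), b)


if __name__ == "__main__":
    u = solve_poisson_ebe(8)
    print(u[4])  # should be close to the exact midpoint value 0.125
```

In a GPU version, the loop over elements in `ebe_matvec` is the natural candidate for parallel execution. Concurrent updates to shared nodes would then need extra care (for example atomic additions or element coloring), which this serial sketch does not address.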

Bibliographic details
Main authors:	Kiss, I., Badics, Z., Gyimothy, S., Pavo, J.
Format:	Conference proceeding
Language:	eng
Subjects:	Computational modeling; CUDA Computing; EbE FEM; Finite element methods; GPU Computing; Graphics processing units; Kernel; Matrix decomposition; parallel FEM; Sparse matrices; Vectors
Online access:	Order full text

container_end_page	5
container_issue
container_start_page	1
container_title
container_volume
creator	Kiss, I. Badics, Z. Gyimothy, S. Pavo, J.
description	The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.
doi_str_mv	10.1109/HPEC.2012.6408659
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6408659</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6408659</ieee_id><sourcerecordid>6408659</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</originalsourceid><addsrcrecordid>eNo1kMFKAzEQhiMiqLUPIF7yAluTzSbZHKXUVijYgwVvZbI7qZFsUjZLYZ_A17ZSe5rv_-ef_zCEPHI245yZ59VmMZ-VjJczVbFaSXNFpkbXvFJacKmluCb3F6E_b8k052_G2OlWlZLdkZ-V33_RkBoIfhgpxJb62PQIGf9o6KGIqUV6gB5CwOBzR13qaU7h6OOeOh_9gBQDdhgH2p2yIdMU6XKzzdSONKYjhsu-sGNxifrucCYYfIoP5MZByDj9nxOyfV18zFfF-n35Nn9ZF01Zm6FQwIwSzgqODLUSJ1O3ElFKA5Yr5UDaRrqyRc2NrBkX2oJuG1dVYDWimJCnc69HxN2h9x304-7_d-IXR7JlyQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J.</creator><creatorcontrib>Kiss, I. ; Badics, Z. ; Gyimothy, S. ; Pavo, J.</creatorcontrib><description>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. 
It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</description><identifier>ISBN: 146731577X</identifier><identifier>ISBN: 9781467315777</identifier><identifier>EISBN: 9781467315753</identifier><identifier>EISBN: 1467315745</identifier><identifier>EISBN: 1467315761</identifier><identifier>EISBN: 9781467315760</identifier><identifier>EISBN: 9781467315746</identifier><identifier>EISBN: 1467315753</identifier><identifier>DOI: 10.1109/HPEC.2012.6408659</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computational modeling ; CUDA Computing ; EbE FEM ; Finite element methods ; GPU Computing ; Graphics processing units ; Kernel ; Matrix decomposition ; parallel FEM ; Sparse matrices ; Vectors</subject><ispartof>2012 IEEE Conference on High Performance Extreme Computing, 2012, p.1-5</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6408659$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2056,27916,54911</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6408659$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kiss, I.</creatorcontrib><creatorcontrib>Badics, Z.</creatorcontrib><creatorcontrib>Gyimothy, S.</creatorcontrib><creatorcontrib>Pavo, J.</creatorcontrib><title>High locality and increased intra-node parallelism 
for solving finite element models on GPUs by novel element-by-element implementation</title><title>2012 IEEE Conference on High Performance Extreme Computing</title><addtitle>HPEC</addtitle><description>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</description><subject>Computational modeling</subject><subject>CUDA Computing</subject><subject>EbE FEM</subject><subject>Finite element methods</subject><subject>GPU Computing</subject><subject>Graphics processing units</subject><subject>Kernel</subject><subject>Matrix decomposition</subject><subject>parallel FEM</subject><subject>Sparse 
matrices</subject><subject>Vectors</subject><isbn>146731577X</isbn><isbn>9781467315777</isbn><isbn>9781467315753</isbn><isbn>1467315745</isbn><isbn>1467315761</isbn><isbn>9781467315760</isbn><isbn>9781467315746</isbn><isbn>1467315753</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1kMFKAzEQhiMiqLUPIF7yAluTzSbZHKXUVijYgwVvZbI7qZFsUjZLYZ_A17ZSe5rv_-ef_zCEPHI245yZ59VmMZ-VjJczVbFaSXNFpkbXvFJacKmluCb3F6E_b8k052_G2OlWlZLdkZ-V33_RkBoIfhgpxJb62PQIGf9o6KGIqUV6gB5CwOBzR13qaU7h6OOeOh_9gBQDdhgH2p2yIdMU6XKzzdSONKYjhsu-sGNxifrucCYYfIoP5MZByDj9nxOyfV18zFfF-n35Nn9ZF01Zm6FQwIwSzgqODLUSJ1O3ElFKA5Yr5UDaRrqyRc2NrBkX2oJuG1dVYDWimJCnc69HxN2h9x304-7_d-IXR7JlyQ</recordid><startdate>201209</startdate><enddate>201209</enddate><creator>Kiss, I.</creator><creator>Badics, Z.</creator><creator>Gyimothy, S.</creator><creator>Pavo, J.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201209</creationdate><title>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</title><author>Kiss, I. ; Badics, Z. ; Gyimothy, S. 
; Pavo, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-6a0963fb31e0e7632897d5ee559ab166fa5bc5f2de719580137ba7dcf44ab7ee3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Computational modeling</topic><topic>CUDA Computing</topic><topic>EbE FEM</topic><topic>Finite element methods</topic><topic>GPU Computing</topic><topic>Graphics processing units</topic><topic>Kernel</topic><topic>Matrix decomposition</topic><topic>parallel FEM</topic><topic>Sparse matrices</topic><topic>Vectors</topic><toplevel>online_resources</toplevel><creatorcontrib>Kiss, I.</creatorcontrib><creatorcontrib>Badics, Z.</creatorcontrib><creatorcontrib>Gyimothy, S.</creatorcontrib><creatorcontrib>Pavo, J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kiss, I.</au><au>Badics, Z.</au><au>Gyimothy, S.</au><au>Pavo, J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation</atitle><btitle>2012 IEEE Conference on High Performance Extreme 
Computing</btitle><stitle>HPEC</stitle><date>2012-09</date><risdate>2012</risdate><spage>1</spage><epage>5</epage><pages>1-5</pages><isbn>146731577X</isbn><isbn>9781467315777</isbn><eisbn>9781467315753</eisbn><eisbn>1467315745</eisbn><eisbn>1467315761</eisbn><eisbn>9781467315760</eisbn><eisbn>9781467315746</eisbn><eisbn>1467315753</eisbn><abstract>The utilization of Graphical Processing Units (GPUs) for the element-by-element (EbE) finite element method (FEM) is demonstrated. EbE FEM is a long known technique, by which a conjugate gradient (CG) type iterative solution scheme can be entirely decomposed into computations on the element level, i.e., without assembling the global system matrix. In our implementation, NVIDIA's parallel computing solution, the Compute Unified Device Architecture (CUDA), is used to perform the required element-wise computations in parallel. Since element matrices need not be stored, the memory requirement can be kept extremely low. It is shown that this low-storage but computation-intensive technique is better suited for GPUs than those requiring the massive manipulation of large data sets. This study of the proposed parallel model illustrates a highly improved locality and minimization of data movement, which could also significantly reduce energy consumption in other heterogeneous HPC architectures.</abstract><pub>IEEE</pub><doi>10.1109/HPEC.2012.6408659</doi><tpages>5</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISBN: 146731577X
ispartof	2012 IEEE Conference on High Performance Extreme Computing, 2012, p.1-5
issn
language	eng
recordid	cdi_ieee_primary_6408659
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Computational modeling; CUDA Computing; EbE FEM; Finite element methods; GPU Computing; Graphics processing units; Kernel; Matrix decomposition; parallel FEM; Sparse matrices; Vectors
title	High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T07%3A06%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=High%20locality%20and%20increased%20intra-node%20parallelism%20for%20solving%20finite%20element%20models%20on%20GPUs%20by%20novel%20element-by-element%20implementation&rft.btitle=2012%20IEEE%20Conference%20on%20High%20Performance%20Extreme%20Computing&rft.au=Kiss,%20I.&rft.date=2012-09&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.isbn=146731577X&rft.isbn_list=9781467315777&rft_id=info:doi/10.1109/HPEC.2012.6408659&rft_dat=%3Cieee_6IE%3E6408659%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467315753&rft.eisbn_list=1467315745&rft.eisbn_list=1467315761&rft.eisbn_list=9781467315760&rft.eisbn_list=9781467315746&rft.eisbn_list=1467315753&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6408659&rfr_iscdi=true