Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in effici...

Detailed description

Bibliographic details
Published in: Computer physics communications 2015-09, Vol.194, p.18-32
Main authors: Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 32
container_issue
container_start_page 18
container_title Computer physics communications
container_volume 194
creator Nishiura, Daisuke
Furuichi, Mikito
Sakaguchi, Hide
description The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
doi_str_mv 10.1016/j.cpc.2015.04.006
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1770292029</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S001046551500137X</els_id><sourcerecordid>1770292029</sourcerecordid><originalsourceid>FETCH-LOGICAL-c509t-31453637ab3d1b7f08dfde8d73d22d99c41eca946d33ab888195433123b198713</originalsourceid><addsrcrecordid>eNp9kLFu2zAQhomgAeo6fYBuHLtIPYqUKKJTYbRJAANZkpmgyVNNgxQVUirgt69sd-5wuOX__sN9hHxhUDNg3bdTbSdbN8DaGkQN0N2RDeulqholxAeyAWBQia5tP5JPpZwAQErFN2TcpTgts5l9Gk2gE-Yh5WhGizQN1NASU5qP6Ohk8uxtQHo8u5zceTTR20KLj0u40nQFaTmajK6KGFM-XxgTAgZqr0f8-PuB3A8mFPz8b2_J26-fr7unav_y-Lz7sa9sC2quOBMt77g0B-7YQQ7Qu8Fh7yR3TeOUsoKhNUp0jnNz6PueqVZwzhp-YKqXjG_J11vvlNP7gmXW0ReLIZgR01I0kxIa1ayzRtktanMqJeOgp-yjyWfNQF_c6pNe3eqLWw1Cr25X5vuNwfWHPx6zLtbjKs35jHbWLvn_0H8BTbiD4g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1770292029</pqid></control><display><type>article</type><title>Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Nishiura, Daisuke ; Furuichi, Mikito ; Sakaguchi, Hide</creator><creatorcontrib>Nishiura, Daisuke ; Furuichi, Mikito ; Sakaguchi, Hide</creatorcontrib><description>The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. 
We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.</description><identifier>ISSN: 0010-4655</identifier><identifier>EISSN: 1879-2944</identifier><identifier>DOI: 10.1016/j.cpc.2015.04.006</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Architecture (computers) ; Central processing units ; Computation ; Computer simulation ; CUDA ; Devices ; GPU ; Graphics processing units ; Hydrodynamics ; MIC ; OpenMP ; Particle simulation ; Processors ; SPH</subject><ispartof>Computer physics communications, 2015-09, Vol.194, p.18-32</ispartof><rights>2015 The 
Authors</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c509t-31453637ab3d1b7f08dfde8d73d22d99c41eca946d33ab888195433123b198713</citedby><cites>FETCH-LOGICAL-c509t-31453637ab3d1b7f08dfde8d73d22d99c41eca946d33ab888195433123b198713</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.cpc.2015.04.006$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Nishiura, Daisuke</creatorcontrib><creatorcontrib>Furuichi, Mikito</creatorcontrib><creatorcontrib>Sakaguchi, Hide</creatorcontrib><title>Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing</title><title>Computer physics communications</title><description>The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. 
In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.</description><subject>Architecture (computers)</subject><subject>Central processing units</subject><subject>Computation</subject><subject>Computer simulation</subject><subject>CUDA</subject><subject>Devices</subject><subject>GPU</subject><subject>Graphics processing units</subject><subject>Hydrodynamics</subject><subject>MIC</subject><subject>OpenMP</subject><subject>Particle simulation</subject><subject>Processors</subject><subject>SPH</subject><issn>0010-4655</issn><issn>1879-2944</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNp9kLFu2zAQhomgAeo6fYBuHLtIPYqUKKJTYbRJAANZkpmgyVNNgxQVUirgt69sd-5wuOX__sN9hHxhUDNg3bdTbSdbN8DaGkQN0N2RDeulqholxAeyAWBQia5tP5JPpZwAQErFN2TcpTgts5l9Gk2gE-Yh5WhGizQN1NASU5qP6Ohk8uxtQHo8u5zceTTR20KLj0u40nQFaTmajK6KGFM-XxgTAgZqr0f8-PuB3A8mFPz8b2_J26-fr7unav_y-Lz7sa9sC2quOBMt77g0B-7YQQ7Qu8Fh7yR3TeOUsoKhNUp0jnNz6PueqVZwzhp-YKqXjG_J11vvlNP7gmXW0ReLIZgR01I0kxIa1ayzRtktanMqJeOgp-yjyWfNQF_c6pNe3eqLWw1Cr25X5vuNwfWHPx6zLtbjKs35jHbWLvn_0H8BTbiD4g</recordid><startdate>20150901</startdate><enddate>20150901</enddate><creator>Nishiura, Daisuke</creator><creator>Furuichi, Mikito</creator><creator>Sakaguchi, Hide</creator><general>Elsevier 
B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7U5</scope><scope>8FD</scope><scope>H8D</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20150901</creationdate><title>Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing</title><author>Nishiura, Daisuke ; Furuichi, Mikito ; Sakaguchi, Hide</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c509t-31453637ab3d1b7f08dfde8d73d22d99c41eca946d33ab888195433123b198713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Architecture (computers)</topic><topic>Central processing units</topic><topic>Computation</topic><topic>Computer simulation</topic><topic>CUDA</topic><topic>Devices</topic><topic>GPU</topic><topic>Graphics processing units</topic><topic>Hydrodynamics</topic><topic>MIC</topic><topic>OpenMP</topic><topic>Particle simulation</topic><topic>Processors</topic><topic>SPH</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nishiura, Daisuke</creatorcontrib><creatorcontrib>Furuichi, Mikito</creatorcontrib><creatorcontrib>Sakaguchi, Hide</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts 
Professional</collection><jtitle>Computer physics communications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nishiura, Daisuke</au><au>Furuichi, Mikito</au><au>Sakaguchi, Hide</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing</atitle><jtitle>Computer physics communications</jtitle><date>2015-09-01</date><risdate>2015</risdate><volume>194</volume><spage>18</spage><epage>32</epage><pages>18-32</pages><issn>0010-4655</issn><eissn>1879-2944</eissn><abstract>The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. 
This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.cpc.2015.04.006</doi><tpages>15</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0010-4655
ispartof Computer physics communications, 2015-09, Vol.194, p.18-32
issn 0010-4655
1879-2944
language eng
recordid cdi_proquest_miscellaneous_1770292029
source Elsevier ScienceDirect Journals Complete
subjects Architecture (computers)
Central processing units
Computation
Computer simulation
CUDA
Devices
GPU
Graphics processing units
Hydrodynamics
MIC
OpenMP
Particle simulation
Processors
SPH
title Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T14%3A43%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Computational%20performance%20of%20a%20smoothed%20particle%20hydrodynamics%20simulation%20for%20shared-memory%20parallel%20computing&rft.jtitle=Computer%20physics%20communications&rft.au=Nishiura,%20Daisuke&rft.date=2015-09-01&rft.volume=194&rft.spage=18&rft.epage=32&rft.pages=18-32&rft.issn=0010-4655&rft.eissn=1879-2944&rft_id=info:doi/10.1016/j.cpc.2015.04.006&rft_dat=%3Cproquest_cross%3E1770292029%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1770292029&rft_id=info:pmid/&rft_els_id=S001046551500137X&rfr_iscdi=true