Spherical harmonic transform with GPUs
We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential step...
Published in: | arXiv.org 2010-10 |
---|---|
Main authors: | Hupca, Ioan O; Falcou, Joel; Grigori, Laura; Stompor, Radek |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Hupca, Ioan O; Falcou, Joel; Grigori, Laura; Stompor, Radek |
description | We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transform's computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single processor or 4 processors. In particular, we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 times with respect to S2HAT on 4 cores, with the overall performance being limited by the fast Fourier transforms. The work presented here has been performed in the context of Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability. |
doi_str_mv | 10.48550/arxiv.1010.1260 |
format | Article |
fullrecord | <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_1010_1260</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2086278436</sourcerecordid><originalsourceid>FETCH-LOGICAL-a516-39b8a8bf9b5f204555a4d739439641cf8cd7e83dae60370464d6f69559e42b053</originalsourceid><addsrcrecordid>eNotj11LwzAYhYMgOObuvZKC4F3nm7x50-RShk5hoOC8Lmmb0I71w6Tz49_bOa8OHB4O52HsisNSaiK4s-G7-VxymAouFJyxmUDkqZZCXLBFjDsAECoTRDhjt29D7UJT2n1S29D2XVMmY7Bd9H1ok69mrJP163u8ZOfe7qNb_OecbR8ftqundPOyfl7db1JLXKVoCm114U1BXoAkIiurDI1EoyQvvS6rzGmsrFOAGUglK-WVITJOigII5-z6NPvnkA-haW34yY8u-dFlAm5OwBD6j4OLY77rD6GbLuUCtBKZlqjwF1nxSco</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2086278436</pqid></control><display><type>article</type><title>Spherical harmonic transform with GPUs</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Hupca, Ioan O ; Falcou, Joel ; Grigori, Laura ; Stompor, Radek</creator><creatorcontrib>Hupca, Ioan O ; Falcou, Joel ; Grigori, Laura ; Stompor, Radek</creatorcontrib><description>We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphic processing units (GPU). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transforms computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single or 4 processors. 
In particular we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 with respect to S2HAT on 4 cores, with the overall performance being limited by the Fast Fourier transforms. The work presented here has been performed in the context of the Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.1010.1260</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Big Bang theory ; Computer Science - Distributed, Parallel, and Cluster Computing ; Computer simulation ; Cosmic microwave background ; Fast Fourier transformations ; FORTRAN ; Fourier transforms ; Graphics processing units ; Optimization ; Performance assessment ; Physics - Cosmology and Nongalactic Astrophysics ; Spherical harmonics</subject><ispartof>arXiv.org, 2010-10</ispartof><rights>2010. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27902</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.1010.1260$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1007/978-3-642-29737-3_40$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Hupca, Ioan O</creatorcontrib><creatorcontrib>Falcou, Joel</creatorcontrib><creatorcontrib>Grigori, Laura</creatorcontrib><creatorcontrib>Stompor, Radek</creatorcontrib><title>Spherical harmonic transform with GPUs</title><title>arXiv.org</title><description>We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphic processing units (GPU). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transforms computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single or 4 processors. 
In particular we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 with respect to S2HAT on 4 cores, with the overall performance being limited by the Fast Fourier transforms. The work presented here has been performed in the context of the Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.</description><subject>Algorithms</subject><subject>Big Bang theory</subject><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><subject>Computer simulation</subject><subject>Cosmic microwave background</subject><subject>Fast Fourier transformations</subject><subject>FORTRAN</subject><subject>Fourier transforms</subject><subject>Graphics processing units</subject><subject>Optimization</subject><subject>Performance assessment</subject><subject>Physics - Cosmology and Nongalactic Astrophysics</subject><subject>Spherical harmonics</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><sourceid>GOX</sourceid><recordid>eNotj11LwzAYhYMgOObuvZKC4F3nm7x50-RShk5hoOC8Lmmb0I71w6Tz49_bOa8OHB4O52HsisNSaiK4s-G7-VxymAouFJyxmUDkqZZCXLBFjDsAECoTRDhjt29D7UJT2n1S29D2XVMmY7Bd9H1ok69mrJP163u8ZOfe7qNb_OecbR8ftqundPOyfl7db1JLXKVoCm114U1BXoAkIiurDI1EoyQvvS6rzGmsrFOAGUglK-WVITJOigII5-z6NPvnkA-haW34yY8u-dFlAm5OwBD6j4OLY77rD6GbLuUCtBKZlqjwF1nxSco</recordid><startdate>20101006</startdate><enddate>20101006</enddate><creator>Hupca, Ioan O</creator><creator>Falcou, Joel</creator><creator>Grigori, Laura</creator><creator>Stompor, Radek</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20101006</creationdate><title>Spherical harmonic transform with GPUs</title><author>Hupca, Ioan O ; Falcou, Joel ; Grigori, Laura ; Stompor, Radek</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a516-39b8a8bf9b5f204555a4d739439641cf8cd7e83dae60370464d6f69559e42b053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Algorithms</topic><topic>Big Bang theory</topic><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><topic>Computer simulation</topic><topic>Cosmic microwave background</topic><topic>Fast Fourier transformations</topic><topic>FORTRAN</topic><topic>Fourier transforms</topic><topic>Graphics processing units</topic><topic>Optimization</topic><topic>Performance assessment</topic><topic>Physics - Cosmology and Nongalactic Astrophysics</topic><topic>Spherical harmonics</topic><toplevel>online_resources</toplevel><creatorcontrib>Hupca, Ioan O</creatorcontrib><creatorcontrib>Falcou, Joel</creatorcontrib><creatorcontrib>Grigori, Laura</creatorcontrib><creatorcontrib>Stompor, Radek</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest 
Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hupca, Ioan O</au><au>Falcou, Joel</au><au>Grigori, Laura</au><au>Stompor, Radek</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Spherical harmonic transform with GPUs</atitle><jtitle>arXiv.org</jtitle><date>2010-10-06</date><risdate>2010</risdate><eissn>2331-8422</eissn><abstract>We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphic processing units (GPU). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transforms computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single or 4 processors. 
In particular we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 with respect to S2HAT on 4 cores, with the overall performance being limited by the Fast Fourier transforms. The work presented here has been performed in the context of the Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.1010.1260</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2010-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_1010_1260 |
source | arXiv.org; Free E- Journals |
subjects | Algorithms; Big Bang theory; Computer Science - Distributed, Parallel, and Cluster Computing; Computer simulation; Cosmic microwave background; Fast Fourier transformations; FORTRAN; Fourier transforms; Graphics processing units; Optimization; Performance assessment; Physics - Cosmology and Nongalactic Astrophysics; Spherical harmonics |
title | Spherical harmonic transform with GPUs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T07%3A29%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Spherical%20harmonic%20transform%20with%20GPUs&rft.jtitle=arXiv.org&rft.au=Hupca,%20Ioan%20O&rft.date=2010-10-06&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.1010.1260&rft_dat=%3Cproquest_arxiv%3E2086278436%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2086278436&rft_id=info:pmid/&rfr_iscdi=true |
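
The description mentions two major sequential steps in the inverse transform but does not spell them out in this record. A minimal sketch, assuming they correspond to the usual textbook decomposition of a scalar inverse spherical harmonic transform (a Legendre-function sum per ring, followed by a Fourier sum along the ring), might look as follows. The function names, the dictionary layout of the coefficients, and the use of a naive Fourier sum in place of an FFT are all illustrative assumptions; this is plain Python, not the S2HAT or CUDA code.

```python
import cmath
import math


def legendre_sums(alm, lmax, theta):
    """Step 1 (assumed): Delta_m(theta) = sum_{l>=m} a_lm * lambda_lm(theta), m = 0..lmax.

    alm is a hypothetical dict {(l, m): coefficient} holding m >= 0 only;
    lambda_lm are the normalised associated Legendre functions.
    """
    x, s = math.cos(theta), math.sin(theta)
    deltas = []
    lam_mm = 1.0 / math.sqrt(4.0 * math.pi)  # lambda_00
    for m in range(lmax + 1):
        if m > 0:
            # lambda_mm from lambda_(m-1)(m-1)
            lam_mm *= -math.sqrt((2 * m + 1) / (2 * m)) * s
        prev2, prev1 = 0.0, lam_mm
        total = alm.get((m, m), 0.0) * prev1
        for l in range(m + 1, lmax + 1):
            # Standard three-term recurrence in l at fixed m.
            a = math.sqrt((4 * l * l - 1) / (l * l - m * m))
            b = 0.0
            if l > m + 1:
                b = math.sqrt(((l - 1) ** 2 - m * m) / (4 * (l - 1) ** 2 - 1))
            cur = a * (x * prev1 - b * prev2)
            total += alm.get((l, m), 0.0) * cur
            prev2, prev1 = prev1, cur
        deltas.append(total)
    return deltas


def ring_synthesis(deltas, nphi):
    """Step 2 (assumed): Fourier sum over m on nphi equispaced pixels of one ring.

    A naive O(nphi * lmax) sum for a real-valued field; a production code
    would use an FFT here.
    """
    ring = []
    for j in range(nphi):
        phi = 2.0 * math.pi * j / nphi
        value = complex(deltas[0]).real
        for m in range(1, len(deltas)):
            value += 2.0 * (deltas[m] * cmath.exp(1j * m * phi)).real
        ring.append(value)
    return ring


def inverse_sht(alm, lmax, thetas, nphi):
    """Apply both steps to every ring (one colatitude theta per ring)."""
    return [ring_synthesis(legendre_sums(alm, lmax, t), nphi) for t in thetas]
```

In this sketch each ring is computed independently of the others, which makes the per-ring work a natural candidate for parallel execution; how the authors actually map the computation onto CUDA threads is not stated in this record.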