Spherical harmonic transform with GPUs

We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential step...

Bibliographic details
Published in: arXiv.org 2010-10
Main authors: Hupca, Ioan O; Falcou, Joel; Grigori, Laura; Stompor, Radek
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Hupca, Ioan O
Falcou, Joel
Grigori, Laura
Stompor, Radek
description We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transform's computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single processor or 4 processors. In particular, we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 times with respect to S2HAT on 4 cores, with the overall performance being limited by the fast Fourier transforms. The work presented here has been performed in the context of cosmic microwave background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.
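The description does not name the two sequential steps. The sketch below assumes they are the standard pair for an inverse spherical harmonic transform of a real field: a Legendre recurrence that builds, for each ring and each m, the sum over l of a_lm times the normalized associated Legendre function, followed by a Fourier synthesis along the ring. It is a minimal pure-Python illustration, not the S2HAT or CUDA code; the names legendre_column and inverse_sht and the dictionary layout of alm are hypothetical, and a direct Fourier sum stands in for the FFT.

```python
import cmath
import math


def legendre_column(m, lmax, theta):
    """Normalized associated Legendre values lambda_lm(theta), l = m..lmax."""
    x, s = math.cos(theta), math.sin(theta)
    # Diagonal recurrence: start at lambda_00 = 1/sqrt(4*pi), climb to lambda_mm.
    lam_mm = math.sqrt(1.0 / (4.0 * math.pi))
    for k in range(1, m + 1):
        lam_mm *= -math.sqrt((2 * k + 1) / (2 * k)) * s
    column = [lam_mm]
    if lmax > m:
        # First off-diagonal term, l = m + 1.
        column.append(math.sqrt(2 * m + 3) * x * lam_mm)
    for l in range(m + 2, lmax + 1):
        # Two-term recurrence in l at fixed m.
        a = math.sqrt((4 * l * l - 1) / (l * l - m * m))
        b = math.sqrt(((l - 1) ** 2 - m * m) / (4 * (l - 1) ** 2 - 1))
        column.append(a * (x * column[-1] - b * column[-2]))
    return column


def inverse_sht(alm, lmax, thetas, nphi):
    """Synthesize a real field on rings at colatitudes `thetas`.

    alm is a dict {(l, m): coefficient} holding only m >= 0 (assumed layout);
    missing entries count as zero. Returns one list of nphi samples per ring.
    """
    rings = []
    for theta in thetas:
        # Step 1 (assumed): Delta_m(theta) = sum_l a_lm * lambda_lm(theta).
        delta = []
        for m in range(lmax + 1):
            lam = legendre_column(m, lmax, theta)
            delta.append(
                sum(alm.get((m + i, m), 0.0) * lam[i] for i in range(len(lam)))
            )
        # Step 2 (assumed): Fourier synthesis in phi. A direct sum is used
        # here for clarity; an FFT would replace this loop in practice.
        ring = []
        for k in range(nphi):
            phi = 2.0 * math.pi * k / nphi
            value = delta[0].real + 2.0 * sum(
                (delta[m] * cmath.exp(1j * m * phi)).real
                for m in range(1, lmax + 1)
            )
            ring.append(value)
        rings.append(ring)
    return rings
```

As a sanity check under these assumptions, passing only a_00 = sqrt(4*pi) should give a field equal to 1 on every ring, since the l = 0 harmonic is the constant 1/sqrt(4*pi).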
doi_str_mv 10.48550/arxiv.1010.1260
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2010-10
issn 2331-8422
language eng
recordid cdi_arxiv_primary_1010_1260
source arXiv.org; Free E- Journals
subjects Algorithms
Big Bang theory
Computer Science - Distributed, Parallel, and Cluster Computing
Computer simulation
Cosmic microwave background
Fast Fourier transformations
FORTRAN
Fourier transforms
Graphics processing units
Optimization
Performance assessment
Physics - Cosmology and Nongalactic Astrophysics
Spherical harmonics
title Spherical harmonic transform with GPUs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T07%3A29%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Spherical%20harmonic%20transform%20with%20GPUs&rft.jtitle=arXiv.org&rft.au=Hupca,%20Ioan%20O&rft.date=2010-10-06&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.1010.1260&rft_dat=%3Cproquest_arxiv%3E2086278436%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2086278436&rft_id=info:pmid/&rfr_iscdi=true