Can GPU performance increase faster than the code error rate?

Description
Graphics processing units (GPUs) are the reference architecture for accelerating high-performance computing applications and the training and inference of convolutional neural networks. For both domains, performance and reliability are two of the main constraints. It is commonly believed that the only way to increase reliability is to sacrifice performance, e.g., by adding redundancy. We show in this paper that this is not always the case. As a very promising result, we found that most GPU performance improvements also increase the number of executions completed correctly before a silent data corruption (SDC) is experienced. We consider four common GPU performance optimizations: architectural solutions, software implementations, compiler optimizations, and the degree of thread parallelism. We compare different implementations of a variety of parallel codes and, through beam experiments and application profiling, show that a performance improvement typically (but not necessarily) increases the GPU SDC rate. Nevertheless, for the vast majority of configurations the performance gain is much higher than the SDC rate increase, allowing a larger amount of correct data to be processed. As we show, programmer choices can increase the number of correctly completed executions by up to 25× without redesigning the algorithm or adding specific hardening solutions.
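The trade-off the abstract describes can be summarized with a back-of-the-envelope relation: if an optimization speeds a code up by a factor S but multiplies the SDC rate by a factor R, the expected number of executions completed correctly between SDCs still grows whenever S/R > 1. The Python sketch below is not taken from the paper; the function name and the example numbers are illustrative assumptions, not measured values.

    # Back-of-the-envelope sketch of the trade-off discussed in the abstract.
    # Assumption: SDCs arrive at a constant rate, so the expected number of
    # executions finished correctly before one SDC is (1 / sdc_rate) / exec_time.
    # The names and numbers below are illustrative, not values from the paper.

    def correct_executions_between_sdcs(exec_time_s: float, sdc_rate_per_s: float) -> float:
        """Expected executions completed correctly before one SDC occurs."""
        mean_time_between_sdcs = 1.0 / sdc_rate_per_s
        return mean_time_between_sdcs / exec_time_s

    # Hypothetical baseline: 1 s per execution, SDC rate of 1e-6 per second.
    baseline = correct_executions_between_sdcs(exec_time_s=1.0, sdc_rate_per_s=1e-6)

    # Hypothetical optimized version: 4x faster but with a 1.5x higher SDC rate.
    optimized = correct_executions_between_sdcs(exec_time_s=0.25, sdc_rate_per_s=1.5e-6)

    print(f"gain in correctly completed executions: {optimized / baseline:.2f}x")  # ~2.67x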

Bibliographic Details
Published in: The Journal of supercomputing, 2024-08, Vol. 80 (12), p. 16918-16946
Main authors: dos Santos, Fernando Fernandes; Rech, Paolo
Format: Article
Language: English
Subjects: Algorithms; Artificial neural networks; Compilers; Computer Science; Configuration management; Graphics processing units; Hardware Architecture; Interpreters; Processor Architectures; Programming Languages; Reliability
Online access: Full text
Publisher: Springer US, New York
ISSN: 0920-8542
EISSN: 1573-0484
DOI: 10.1007/s11227-024-06119-4