Polyhedral parallelization of binary code

Many automatic software parallelization systems have been proposed in the past decades, but most of them are dedicated to source-to-source transformations. This paper shows that parallelizing executable programs is feasible, even if they require complex transformations, and in effect decouples paral...

Bibliographic details
Published in: ACM transactions on architecture and code optimization 2012-01, Vol.8 (4), p.1-21
Main authors: Pradelle, Benoit, Ketterlin, Alain, Clauss, Philippe
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 21
container_issue 4
container_start_page 1
container_title ACM transactions on architecture and code optimization
container_volume 8
creator Pradelle, Benoit
Ketterlin, Alain
Clauss, Philippe
description Many automatic software parallelization systems have been proposed in the past decades, but most of them are dedicated to source-to-source transformations. This paper shows that parallelizing executable programs is feasible, even if they require complex transformations, and in effect decouples parallelization from compilation, for example, for closed-source or legacy software, where binary code is the only available representation. We propose an automatic parallelizer, which is able to perform advanced parallelization on binary code. It first parses the binary code and extracts high-level information. From this information, a C program is generated. This program captures only a subset of the program semantics, namely, loops and memory accesses. This C program is then parallelized using existing, state-of-the-art parallelizers, including advanced polyhedral parallelizers. The original program semantics is then re-injected, and the transformed parallel loop nests are recompiled by a standard C compiler. We show on the PolyBench benchmark suite that our system successfully detects and parallelizes almost all the loop nests from the binary code, using a recent polyhedral loop parallelizer as a backend. The paper ends by elaborating a strategy to parallelize more complex programs, such as those containing non-linear accesses to memory, and provides a few example case-studies.
doi_str_mv 10.1145/2086696.2086718
format Article
publisher Association for Computing Machinery
rights Distributed under a Creative Commons Attribution 4.0 International License
sourcetype Open Access Repository
toplevel peer_reviewed; online_resources
oa free_for_read
orcid https://orcid.org/0000-0002-5759-9195
backlink https://inria.hal.science/hal-00664370 (View record in HAL)
fulltext fulltext
identifier ISSN: 1544-3566
ispartof ACM transactions on architecture and code optimization, 2012-01, Vol.8 (4), p.1-21
issn 1544-3566
1544-3973
language eng
recordid cdi_hal_primary_oai_HAL_hal_00664370v1
source ACM Digital Library Complete; EZB-FREE-00999 freely available EZB journals
subjects Architecture
Binary codes
Computation and Language
Computer programs
Computer Science
Legacy
Optimization
Representations
Software
Transformations
title Polyhedral parallelization of binary code
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T19%3A30%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Polyhedral%20parallelization%20of%20binary%20code&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Pradelle,%20Benoit&rft.date=2012-01-01&rft.volume=8&rft.issue=4&rft.spage=1&rft.epage=21&rft.pages=1-21&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/2086696.2086718&rft_dat=%3Cproquest_hal_p%3E1671580163%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1671580163&rft_id=info:pmid/&rfr_iscdi=true