Unrolling-based optimizations for modulo scheduling

Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the reg...

Bibliographic details
Main authors: Lavery, D.M., Hwu, W.-W.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 337
container_issue
container_start_page 327
container_title
container_volume
creator Lavery, D.M.
Hwu, W.-W.
description Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.
doi_str_mv 10.1109/MICRO.1995.476842
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_6IE</sourceid><recordid>TN_cdi_ieee_primary_476842</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>476842</ieee_id><sourcerecordid>27471312</sourcerecordid><originalsourceid>FETCH-LOGICAL-i118t-c07438e8c60a0d1f2e93904b062b91078f7795c0f7bddcd8876d85e61fc3df83</originalsourceid><addsrcrecordid>eNotkMtKxEAURBtUcGb0A3SVlbvEe9Odfiwl-BgYGZBxHZJ-aEuSjulkoV9vIK6qFodDUYTcIGSIoO5f9-XbMUOliowJLll-RrYgUXJBmWLnZIMg8pSxAi_JNsYvAJBcFRtC3_sxtK3vP9KmjtYkYZh853_ryYc-Ji6MSRfM3IYk6k-7lIW8IheubqO9_s8dOT09nsqX9HB83pcPh9QjyinVIBiVVmoONRh0uVVUAWuA541a9kgnhCo0ONEYo42UghtZWI5OU-Mk3ZG7VTuM4Xu2cao6H7Vt27q3YY5VLphAivkC3q6gt9ZWw-i7evyp1h_oHzJZUcQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>27471312</pqid></control><display><type>conference_proceeding</type><title>Unrolling-based optimizations for modulo scheduling</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Lavery, D.M. ; Hwu, W.-W.</creator><creatorcontrib>Lavery, D.M. ; Hwu, W.-W.</creatorcontrib><description>Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths.
Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.</description><identifier>ISSN: 1072-4451</identifier><identifier>ISBN: 0818673494</identifier><identifier>ISBN: 9780818673498</identifier><identifier>DOI: 10.1109/MICRO.1995.476842</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computer aided instruction ; Delay ; Hardware ; Optimizing compilers ; Parallel processing ; Pipeline processing ; Processor scheduling ; Registers ; Scheduling algorithm ; Throughput</subject><ispartof>Proceedings of the 28th Annual International Symposium on Microarchitecture, 1995, p.327-337</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/476842$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/476842$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Lavery, D.M.</creatorcontrib><creatorcontrib>Hwu, W.-W.</creatorcontrib><title>Unrolling-based optimizations for modulo scheduling</title><title>Proceedings of the 28th Annual International Symposium on Microarchitecture</title><addtitle>MICRO</addtitle><description>Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. 
The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.</description><subject>Computer aided instruction</subject><subject>Delay</subject><subject>Hardware</subject><subject>Optimizing compilers</subject><subject>Parallel processing</subject><subject>Pipeline processing</subject><subject>Processor scheduling</subject><subject>Registers</subject><subject>Scheduling
algorithm</subject><subject>Throughput</subject><issn>1072-4451</issn><isbn>0818673494</isbn><isbn>9780818673498</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1995</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotkMtKxEAURBtUcGb0A3SVlbvEe9Odfiwl-BgYGZBxHZJ-aEuSjulkoV9vIK6qFodDUYTcIGSIoO5f9-XbMUOliowJLll-RrYgUXJBmWLnZIMg8pSxAi_JNsYvAJBcFRtC3_sxtK3vP9KmjtYkYZh853_ryYc-Ji6MSRfM3IYk6k-7lIW8IheubqO9_s8dOT09nsqX9HB83pcPh9QjyinVIBiVVmoONRh0uVVUAWuA541a9kgnhCo0ONEYo42UghtZWI5OU-Mk3ZG7VTuM4Xu2cao6H7Vt27q3YY5VLphAivkC3q6gt9ZWw-i7evyp1h_oHzJZUcQ</recordid><startdate>1995</startdate><enddate>1995</enddate><creator>Lavery, D.M.</creator><creator>Hwu, W.-W.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>1995</creationdate><title>Unrolling-based optimizations for modulo scheduling</title><author>Lavery, D.M. 
; Hwu, W.-W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i118t-c07438e8c60a0d1f2e93904b062b91078f7795c0f7bddcd8876d85e61fc3df83</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1995</creationdate><topic>Computer aided instruction</topic><topic>Delay</topic><topic>Hardware</topic><topic>Optimizing compilers</topic><topic>Parallel processing</topic><topic>Pipeline processing</topic><topic>Processor scheduling</topic><topic>Registers</topic><topic>Scheduling algorithm</topic><topic>Throughput</topic><toplevel>online_resources</toplevel><creatorcontrib>Lavery, D.M.</creatorcontrib><creatorcontrib>Hwu, W.-W.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lavery, D.M.</au><au>Hwu, W.-W.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Unrolling-based optimizations for modulo scheduling</atitle><btitle>Proceedings of the 28th Annual International Symposium on 
Microarchitecture</btitle><stitle>MICRO</stitle><date>1995</date><risdate>1995</risdate><spage>327</spage><epage>337</epage><pages>327-337</pages><issn>1072-4451</issn><isbn>0818673494</isbn><isbn>9780818673498</isbn><abstract>Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.</abstract><pub>IEEE</pub><doi>10.1109/MICRO.1995.476842</doi><tpages>11</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1072-4451
ispartof Proceedings of the 28th Annual International Symposium on Microarchitecture, 1995, p.327-337
issn 1072-4451
language eng
recordid cdi_ieee_primary_476842
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Computer aided instruction
Delay
Hardware
Optimizing compilers
Parallel processing
Pipeline processing
Processor scheduling
Registers
Scheduling algorithm
Throughput
title Unrolling-based optimizations for modulo scheduling
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T22%3A11%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Unrolling-based%20optimizations%20for%20modulo%20scheduling&rft.btitle=Proceedings%20of%20the%2028th%20Annual%20International%20Symposium%20on%20Microarchitecture&rft.au=Lavery,%20D.M.&rft.date=1995&rft.spage=327&rft.epage=337&rft.pages=327-337&rft.issn=1072-4451&rft.isbn=0818673494&rft.isbn_list=9780818673498&rft_id=info:doi/10.1109/MICRO.1995.476842&rft_dat=%3Cproquest_6IE%3E27471312%3C/proquest_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=27471312&rft_id=info:pmid/&rft_ieee_id=476842&rfr_iscdi=true
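
Note on the description field: the statement that unrolling allows "a smaller non-integral effective initiation interval" can be illustrated with a small calculation. The sketch below is not taken from the paper or from the IMPACT compiler. It assumes a single class of identical functional units and models only the resource bound on the initiation interval, ignoring recurrences and register pressure. The function name `effective_ii` and its parameters are hypothetical.

```python
from fractions import Fraction
from math import ceil


def effective_ii(ops_per_iteration: int, units: int, unroll: int) -> Fraction:
    """Resource-bound initiation interval per ORIGINAL iteration.

    Assumption (illustrative only): all operations compete for `units`
    identical functional units, and no recurrence limits the schedule.
    """
    # The unrolled body contains unroll * ops_per_iteration operations,
    # and the scheduler must choose an integer II for that unrolled body.
    ii_unrolled_body = ceil(Fraction(unroll * ops_per_iteration, units))
    # Dividing by the unroll factor gives the cost per original iteration,
    # which may be a non-integer.
    return Fraction(ii_unrolled_body, unroll)


if __name__ == "__main__":
    for ops, units, unroll in [(3, 2, 1), (3, 2, 2), (1, 2, 1), (1, 2, 2)]:
        print(ops, units, unroll, effective_ii(ops, units, unroll))
```

Under these assumptions, a loop with 3 operations on 2 units is rounded up to 2 cycles per iteration without unrolling, while unrolling by 2 reaches 3/2 cycles per iteration. A loop with 1 operation on 2 units goes from 1 cycle per iteration to 1/2, that is, two iterations completed per cycle.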