Unrolling-based optimizations for modulo scheduling

Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo-scheduled loop depends on the resource requirements, the dependence pattern, and the reg...

Bibliographic details
Main authors:	Lavery, D.M., Hwu, W.-W.
Format:	Conference proceeding
Language:	eng
Subjects:	Computer aided instruction ; Delay ; Hardware ; Optimizing compilers ; Parallel processing ; Pipeline processing ; Processor scheduling ; Registers ; Scheduling algorithm ; Throughput
Online access:	Order full text

container_end_page	337
container_issue
container_start_page	327
container_title
container_volume
creator	Lavery, D.M. ; Hwu, W.-W.
description	Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo-scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller, non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.
doi_str_mv	10.1109/MICRO.1995.476842
format	Conference Proceeding
isbn	0818673494 ; 9780818673498
publisher	IEEE
fulltext	fulltext_linktorsrc
identifier	ISSN: 1072-4451
ispartof	Proceedings of the 28th Annual International Symposium on Microarchitecture, 1995, p.327-337
issn	1072-4451
language	eng
recordid	cdi_ieee_primary_476842
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Computer aided instruction ; Delay ; Hardware ; Optimizing compilers ; Parallel processing ; Pipeline processing ; Processor scheduling ; Registers ; Scheduling algorithm ; Throughput
title	Unrolling-based optimizations for modulo scheduling
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T22%3A11%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Unrolling-based%20optimizations%20for%20modulo%20scheduling&rft.btitle=Proceedings%20of%20the%2028th%20Annual%20International%20Symposium%20on%20Microarchitecture&rft.au=Lavery,%20D.M.&rft.date=1995&rft.spage=327&rft.epage=337&rft.pages=327-337&rft.issn=1072-4451&rft.isbn=0818673494&rft.isbn_list=9780818673498&rft_id=info:doi/10.1109/MICRO.1995.476842&rft_dat=%3Cproquest_6IE%3E27471312%3C/proquest_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=27471312&rft_id=info:pmid/&rft_ieee_id=476842&rfr_iscdi=true
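
The description above states that unrolling can yield a smaller, non-integral effective initiation interval. The following is a minimal sketch of that arithmetic, not the paper's algorithm. It assumes the lower bound on the initiation interval scales linearly with the unroll factor and that the scheduler can only use whole-cycle intervals. The function name `effective_ii` and the example bounds (3/2 and 1/2 cycles per iteration) are hypothetical, and the sketch ignores the post-unrolling optimizations and register pressure mentioned in the abstract.

```python
from fractions import Fraction
from math import ceil


def effective_ii(min_ii: Fraction, unroll: int) -> Fraction:
    """Cycles per ORIGINAL iteration after unrolling `unroll` times.

    `min_ii` is an assumed fractional lower bound (cycles per iteration)
    coming from resources or recurrences; it is illustrative only.
    """
    if unroll < 1:
        raise ValueError("unroll factor must be at least 1")
    # The unrolled body contains `unroll` original iterations, so its
    # lower bound is `unroll * min_ii`, rounded up to a whole cycle count.
    ii_unrolled = ceil(min_ii * unroll)
    # Dividing by the unroll factor gives the per-iteration (effective) II.
    return Fraction(ii_unrolled, unroll)


# A bound of 3/2 cycles forces II = 2 without unrolling,
# but unrolling twice gives II = 3 for two iterations, i.e. 3/2.
no_unroll = effective_ii(Fraction(3, 2), 1)
unroll_by_two = effective_ii(Fraction(3, 2), 2)

# A bound of 1/2 cycle reaches two iterations per cycle once unrolled twice.
two_per_cycle = effective_ii(Fraction(1, 2), 2)
```

Under these assumptions, rounding happens once per unrolled body rather than once per original iteration, which is why the effective interval can drop below the next whole cycle and, for bounds under one cycle, below one cycle per iteration.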