Automatic code mapping on an intelligent memory architecture
This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each secti...
Published in: | IEEE transactions on computers 2001-11, Vol.50 (11), p.1248-1266 |
---|---|
Main authors: | Yan Solihin ; Jaejin Lee ; Torrellas, J. |
Format: | Article |
Language: | eng |
Subjects: | |
|
container_end_page | 1266 |
---|---|
container_issue | 11 |
container_start_page | 1248 |
container_title | IEEE transactions on computers |
container_volume | 50 |
creator | Yan Solihin ; Jaejin Lee ; Torrellas, J. |
description | This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems. |
doi_str_mv | 10.1109/12.966498 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_27033842</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>966498</ieee_id><sourcerecordid>27033842</sourcerecordid><originalsourceid>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</originalsourceid><addsrcrecordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIcV24i-Jpar4kiqxwGw5zrW4auJgJ0P_PYZUDAww3XDPne70InRO8IwQrG4JnSnOSyUP0IQwJnKlGD9EE4yJzFVR4mN0EuMGY8wpVhN0Nx9635je2cz6GrLGdJ1r15lvM9Nmru1hu3VraPusgcaHXWaCfXc92H4IcIqOVmYb4Wxfp-jt4f518ZQvXx6fF_Nlbgsu-ryquQGlCimlwIxWxEpgAgyXUFkopU2AY1NJteI1I8wIoUyaILRMsqiLKboe93bBfwwQe924aNNlpgU_RK2wUBwTzJO8-lNSKQgvS_o_FLgo5De8_AU3fghteldLWUoqGS0SuhmRDT7GACvdBdeYsNME669cNKF6zCXZi9E6APhx--YnAleG7Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884828523</pqid></control><display><type>article</type><title>Automatic code mapping on an intelligent memory architecture</title><source>IEEE Electronic Library (IEL)</source><creator>Yan Solihin ; Jaejin Lee ; Torrellas, J.</creator><creatorcontrib>Yan Solihin ; Jaejin Lee ; Torrellas, J.</creatorcontrib><description>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. 
The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/12.966498</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Application software ; Architecture ; Computer architecture ; Computer simulation ; Coprocessors ; Delay ; Dynamical systems ; Dynamics ; Heterogeneity ; Intelligent systems ; Mapping ; Memory architecture ; Microprocessors ; Multiprocessing systems ; Partitioning algorithms ; Processor scheduling ; Proposals ; Studies</subject><ispartof>IEEE transactions on computers, 2001-11, Vol.50 (11), p.1248-1266</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2001</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</citedby><cites>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/966498$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/966498$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yan Solihin</creatorcontrib><creatorcontrib>Jaejin Lee</creatorcontrib><creatorcontrib>Torrellas, J.</creatorcontrib><title>Automatic code mapping on an intelligent memory architecture</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. 
Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</description><subject>Algorithms</subject><subject>Application software</subject><subject>Architecture</subject><subject>Computer architecture</subject><subject>Computer simulation</subject><subject>Coprocessors</subject><subject>Delay</subject><subject>Dynamical systems</subject><subject>Dynamics</subject><subject>Heterogeneity</subject><subject>Intelligent systems</subject><subject>Mapping</subject><subject>Memory architecture</subject><subject>Microprocessors</subject><subject>Multiprocessing systems</subject><subject>Partitioning algorithms</subject><subject>Processor scheduling</subject><subject>Proposals</subject><subject>Studies</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2001</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIcV24i-Jpar4kiqxwGw5zrW4auJgJ0P_PYZUDAww3XDPne70InRO8IwQrG4JnSnOSyUP0IQwJnKlGD9EE4yJzFVR4mN0EuMGY8wpVhN0Nx9635je2cz6GrLGdJ1r15lvM9Nmru1hu3VraPusgcaHXWaCfXc92H4IcIqOVmYb4Wxfp-jt4f518ZQvXx6fF_Nlbgsu-ryquQGlCimlwIxWxEpgAgyXUFkopU2AY1NJteI1I8wIoUyaILRMsqiLKboe93bBfwwQe924aNNlpgU_RK2wUBwTzJO8-lNSKQgvS_o_FLgo5De8_AU3fghteldLWUoqGS0SuhmRDT7GACvdBdeYsNME669cNKF6zCXZi9E6APhx--YnAleG7Q</recordid><startdate>20011101</startdate><enddate>20011101</enddate><creator>Yan Solihin</creator><creator>Jaejin Lee</creator><creator>Torrellas, J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20011101</creationdate><title>Automatic code mapping on an intelligent memory architecture</title><author>Yan Solihin ; Jaejin Lee ; Torrellas, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Algorithms</topic><topic>Application software</topic><topic>Architecture</topic><topic>Computer architecture</topic><topic>Computer simulation</topic><topic>Coprocessors</topic><topic>Delay</topic><topic>Dynamical systems</topic><topic>Dynamics</topic><topic>Heterogeneity</topic><topic>Intelligent systems</topic><topic>Mapping</topic><topic>Memory architecture</topic><topic>Microprocessors</topic><topic>Multiprocessing systems</topic><topic>Partitioning algorithms</topic><topic>Processor scheduling</topic><topic>Proposals</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yan Solihin</creatorcontrib><creatorcontrib>Jaejin Lee</creatorcontrib><creatorcontrib>Torrellas, J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection>
<collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yan Solihin</au><au>Jaejin Lee</au><au>Torrellas, J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Automatic code mapping on an intelligent memory architecture</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2001-11-01</date><risdate>2001</risdate><volume>50</volume><issue>11</issue><spage>1248</spage><epage>1266</epage><pages>1248-1266</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. 
Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/12.966498</doi><tpages>19</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2001-11, Vol.50 (11), p.1248-1266 |
issn | 0018-9340 ; 1557-9956 |
language | eng |
recordid | cdi_proquest_miscellaneous_27033842 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Application software ; Architecture ; Computer architecture ; Computer simulation ; Coprocessors ; Delay ; Dynamical systems ; Dynamics ; Heterogeneity ; Intelligent systems ; Mapping ; Memory architecture ; Microprocessors ; Multiprocessing systems ; Partitioning algorithms ; Processor scheduling ; Proposals ; Studies |
title | Automatic code mapping on an intelligent memory architecture |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T23%3A19%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20code%20mapping%20on%20an%20intelligent%20memory%20architecture&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Yan%20Solihin&rft.date=2001-11-01&rft.volume=50&rft.issue=11&rft.spage=1248&rft.epage=1266&rft.pages=1248-1266&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/12.966498&rft_dat=%3Cproquest_RIE%3E27033842%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=884828523&rft_id=info:pmid/&rft_ieee_id=966498&rfr_iscdi=true |
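
The description field in this record states the goal of the mapping only in outline: each code section should run on the processor where it is most efficient, and the host and memory processors should overlap their execution. The Python sketch below is a toy illustration of that idea, not the algorithm from the paper. The `Section` class, both function names, and all timing values are hypothetical, and the sketch assumes independent sections with positive run-time estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A code section with hypothetical per-processor run-time estimates."""
    name: str
    host_time: float    # estimated time on the host processor
    memory_time: float  # estimated time on the memory processor


def map_sections(sections):
    """Assign each section to the processor with the lower estimated time."""
    return {
        s.name: "host" if s.host_time <= s.memory_time else "memory"
        for s in sections
    }


def speedup_bounds(sections, mapping):
    """Return (no_overlap, full_overlap) speedups over a host-only run.

    no_overlap:   the processors take turns, so their busy times add up.
    full_overlap: the processors run in parallel with no dependences,
                  so the longer busy time decides (an optimistic bound).
    """
    baseline = sum(s.host_time for s in sections)
    host_busy = sum(s.host_time for s in sections if mapping[s.name] == "host")
    memory_busy = sum(
        s.memory_time for s in sections if mapping[s.name] == "memory"
    )
    no_overlap = baseline / (host_busy + memory_busy)
    full_overlap = baseline / max(host_busy, memory_busy)
    return no_overlap, full_overlap


# Made-up estimates for three sections (assumption, not measured data).
sections = [
    Section("init", host_time=2.0, memory_time=5.0),
    Section("stream_loop", host_time=8.0, memory_time=4.0),
    Section("reduce", host_time=3.0, memory_time=3.0),
]
mapping = map_sections(sections)
no_overlap, full_overlap = speedup_bounds(sections, mapping)
print(mapping)
print(f"speedup without overlap: {no_overlap:.2f}")
print(f"speedup with full overlap: {full_overlap:.2f}")
```

The gap between the two bounds shows why overlap matters: assigning sections to their faster processor helps on its own, but most of the remaining gain comes from keeping both processors busy at once. A real mapper would also have to respect data dependences between sections, which this sketch ignores.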