Automatic code mapping on an intelligent memory architecture

This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each secti...

Detailed description

Bibliographic details
Published in:	IEEE transactions on computers 2001-11, Vol.50 (11), p.1248-1266
Main authors:	Yan Solihin; Jaejin Lee; Torrellas, J.
Format:	Article
Language:	English
Subjects:	Algorithms; Application software; Architecture; Computer architecture; Computer simulation; Coprocessors; Delay; Dynamical systems; Dynamics; Heterogeneity; Intelligent systems; Mapping; Memory architecture; Microprocessors; Multiprocessing systems; Partitioning algorithms; Processor scheduling; Proposals; Studies

container_end_page	1266
container_issue	11
container_start_page	1248
container_title	IEEE transactions on computers
container_volume	50
creator	Yan Solihin ; Jaejin Lee ; Torrellas, J.
description	This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.
doi_str_mv	10.1109/12.966498
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_27033842</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>966498</ieee_id><sourcerecordid>27033842</sourcerecordid><originalsourceid>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</originalsourceid><addsrcrecordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIcV24i-Jpar4kiqxwGw5zrW4auJgJ0P_PYZUDAww3XDPne70InRO8IwQrG4JnSnOSyUP0IQwJnKlGD9EE4yJzFVR4mN0EuMGY8wpVhN0Nx9635je2cz6GrLGdJ1r15lvM9Nmru1hu3VraPusgcaHXWaCfXc92H4IcIqOVmYb4Wxfp-jt4f518ZQvXx6fF_Nlbgsu-ryquQGlCimlwIxWxEpgAgyXUFkopU2AY1NJteI1I8wIoUyaILRMsqiLKboe93bBfwwQe924aNNlpgU_RK2wUBwTzJO8-lNSKQgvS_o_FLgo5De8_AU3fghteldLWUoqGS0SuhmRDT7GACvdBdeYsNME669cNKF6zCXZi9E6APhx--YnAleG7Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884828523</pqid></control><display><type>article</type><title>Automatic code mapping on an intelligent memory architecture</title><source>IEEE Electronic Library (IEL)</source><creator>Yan Solihin ; Jaejin Lee ; Torrellas, J.</creator><creatorcontrib>Yan Solihin ; Jaejin Lee ; Torrellas, J.</creatorcontrib><description>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. 
The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/12.966498</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Application software ; Architecture ; Computer architecture ; Computer simulation ; Coprocessors ; Delay ; Dynamical systems ; Dynamics ; Heterogeneity ; Intelligent systems ; Mapping ; Memory architecture ; Microprocessors ; Multiprocessing systems ; Partitioning algorithms ; Processor scheduling ; Proposals ; Studies</subject><ispartof>IEEE transactions on computers, 2001-11, Vol.50 (11), p.1248-1266</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2001</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</citedby><cites>FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/966498$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/966498$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yan Solihin</creatorcontrib><creatorcontrib>Jaejin Lee</creatorcontrib><creatorcontrib>Torrellas, J.</creatorcontrib><title>Automatic code mapping on an intelligent memory architecture</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. 
Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</description><subject>Algorithms</subject><subject>Application software</subject><subject>Architecture</subject><subject>Computer architecture</subject><subject>Computer simulation</subject><subject>Coprocessors</subject><subject>Delay</subject><subject>Dynamical systems</subject><subject>Dynamics</subject><subject>Heterogeneity</subject><subject>Intelligent systems</subject><subject>Mapping</subject><subject>Memory architecture</subject><subject>Microprocessors</subject><subject>Multiprocessing systems</subject><subject>Partitioning algorithms</subject><subject>Processor scheduling</subject><subject>Proposals</subject><subject>Studies</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2001</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIcV24i-Jpar4kiqxwGw5zrW4auJgJ0P_PYZUDAww3XDPne70InRO8IwQrG4JnSnOSyUP0IQwJnKlGD9EE4yJzFVR4mN0EuMGY8wpVhN0Nx9635je2cz6GrLGdJ1r15lvM9Nmru1hu3VraPusgcaHXWaCfXc92H4IcIqOVmYb4Wxfp-jt4f518ZQvXx6fF_Nlbgsu-ryquQGlCimlwIxWxEpgAgyXUFkopU2AY1NJteI1I8wIoUyaILRMsqiLKboe93bBfwwQe924aNNlpgU_RK2wUBwTzJO8-lNSKQgvS_o_FLgo5De8_AU3fghteldLWUoqGS0SuhmRDT7GACvdBdeYsNME669cNKF6zCXZi9E6APhx--YnAleG7Q</recordid><startdate>20011101</startdate><enddate>20011101</enddate><creator>Yan Solihin</creator><creator>Jaejin Lee</creator><creator>Torrellas, J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20011101</creationdate><title>Automatic code mapping on an intelligent memory architecture</title><author>Yan Solihin ; Jaejin Lee ; Torrellas, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c367t-bd6ae9938887052b1c8e57ea68ebce48cbd660ab89f6d515a779aae9124c8e3d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Algorithms</topic><topic>Application software</topic><topic>Architecture</topic><topic>Computer architecture</topic><topic>Computer simulation</topic><topic>Coprocessors</topic><topic>Delay</topic><topic>Dynamical systems</topic><topic>Dynamics</topic><topic>Heterogeneity</topic><topic>Intelligent systems</topic><topic>Mapping</topic><topic>Memory architecture</topic><topic>Microprocessors</topic><topic>Multiprocessing systems</topic><topic>Partitioning algorithms</topic><topic>Processor scheduling</topic><topic>Proposals</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yan Solihin</creatorcontrib><creatorcontrib>Jaejin Lee</creatorcontrib><creatorcontrib>Torrellas, J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information 
Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yan Solihin</au><au>Jaejin Lee</au><au>Torrellas, J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Automatic code mapping on an intelligent memory architecture</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2001-11-01</date><risdate>2001</risdate><volume>50</volume><issue>11</issue><spage>1248</spage><epage>1266</epage><pages>1248-1266</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close and often higher than ideal speedups on a more expensive multiprocessor system composed of two identical host processors. 
Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/12.966498</doi><tpages>19</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9340
ispartof	IEEE transactions on computers, 2001-11, Vol.50 (11), p.1248-1266
issn	0018-9340 1557-9956
language	eng
recordid	cdi_proquest_miscellaneous_27033842
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Application software ; Architecture ; Computer architecture ; Computer simulation ; Coprocessors ; Delay ; Dynamical systems ; Dynamics ; Heterogeneity ; Intelligent systems ; Mapping ; Memory architecture ; Microprocessors ; Multiprocessing systems ; Partitioning algorithms ; Processor scheduling ; Proposals ; Studies
title	Automatic code mapping on an intelligent memory architecture
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T23%3A19%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20code%20mapping%20on%20an%20intelligent%20memory%20architecture&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Yan%20Solihin&rft.date=2001-11-01&rft.volume=50&rft.issue=11&rft.spage=1248&rft.epage=1266&rft.pages=1248-1266&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/12.966498&rft_dat=%3Cproquest_RIE%3E27033842%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=884828523&rft_id=info:pmid/&rft_ieee_id=966498&rfr_iscdi=true