Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems
This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by...
Published in: | IEEE access 2024-01, Vol.12, p.1-1 |
---|---|
Main authors: | Letras, Martin ; Falk, Joachim ; Teich, Jurgen |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE access |
container_volume | 12 |
creator | Letras, Martin ; Falk, Joachim ; Teich, Jurgen |
description | This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In contrast to the state of the art, our approach (i) considers memory-size constraints for on-chip memories, (ii) considers hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS). Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems. |
doi_str_mv | 10.1109/ACCESS.2024.3375079 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2969056009</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10463021</ieee_id><doaj_id>oai_doaj_org_article_572379ddeeee4da685128d4115cdbc1c</doaj_id><sourcerecordid>2969056009</sourcerecordid><originalsourceid>FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</originalsourceid><addsrcrecordid>eNpNUU1v2zAMNYYNWNH2F2wHAT0704clWcfWS9cC_RiW3gVFojOnjuVKMrL8-yl1UZQXEiTfeyReUXwjeEEIVj8um2a5Wi0optWCMcmxVJ-KE0qEKhln4vOH-mtxHuMW56hzi8uTYr_8N_Y-dMMG3U996so_YBwEdDW1LYSIzOBQ89cMA_Tod28s7GBIyE2viJ8mmbb3e_QAae_DM7o343gcJI9uIEHwGxjATzEPhkNpfQC0OsQEu3hWfGlNH-H8LZ8WT9fLp-amvHv8ddtc3pWWcZVKA5WsceWow5ZbuVatw7XgUsq6JoassaCEW9pWmBmCnRRKcMEwdpbgVgE7LW5nWufNVo-h25lw0N50-rXhw0abkDrbg-aSMqmcgxyVM6LmhNauIpnfrS2xmeti5hqDf5kgJr31Uxjy9ZoqoTAXGKu8xeYtG3yMAdp3VYL10S89-6WPfuk3vzLq-4zqsvoHRJWfoYT9B8N7kdM</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2969056009</pqid></control><display><type>article</type><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</creator><creatorcontrib>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</creatorcontrib><description>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. 
In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2024.3375079</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Buffers ; Communication ; Dataflow Networks ; FIFO ; Firing ; Heuristic ; Image processing ; Integer programming ; Many-Core Systems ; Mapping ; Memory Management ; Minimization ; Modulo Scheduling ; Pareto Optimization ; Placement ; Processor scheduling ; Schedules ; Scheduling ; Solvers ; Space exploration ; Throughput</subject><ispartof>IEEE access, 2024-01, Vol.12, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</cites><orcidid>0000-0001-6285-5862 ; 0000-0002-1429-8982 ; 0009-0006-0834-3237</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10463021$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2100,27624,27915,27916,54924</link.rule.ids></links><search><creatorcontrib>Letras, Martin</creatorcontrib><creatorcontrib>Falk, Joachim</creatorcontrib><creatorcontrib>Teich, Jurgen</creatorcontrib><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><title>IEEE access</title><addtitle>Access</addtitle><description>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. 
This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</description><subject>Buffers</subject><subject>Communication</subject><subject>Dataflow Networks</subject><subject>FIFO</subject><subject>Firing</subject><subject>Heuristic</subject><subject>Image processing</subject><subject>Integer programming</subject><subject>Many-Core Systems</subject><subject>Mapping</subject><subject>Memory Management</subject><subject>Minimization</subject><subject>Modulo Scheduling</subject><subject>Pareto Optimization</subject><subject>Placement</subject><subject>Processor scheduling</subject><subject>Schedules</subject><subject>Scheduling</subject><subject>Solvers</subject><subject>Space exploration</subject><subject>Throughput</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUU1v2zAMNYYNWNH2F2wHAT0704clWcfWS9cC_RiW3gVFojOnjuVKMrL8-yl1UZQXEiTfeyReUXwjeEEIVj8um2a5Wi0optWCMcmxVJ-KE0qEKhln4vOH-mtxHuMW56hzi8uTYr_8N_Y-dMMG3U996so_YBwEdDW1LYSIzOBQ89cMA_Tod28s7GBIyE2viJ8mmbb3e_QAae_DM7o343gcJI9uIEHwGxjATzEPhkNpfQC0OsQEu3hWfGlNH-H8LZ8WT9fLp-amvHv8ddtc3pWWcZVKA5WsceWow5ZbuVatw7XgUsq6JoassaCEW9pWmBmCnRRKcMEwdpbgVgE7LW5nWufNVo-h25lw0N50-rXhw0abkDrbg-aSMqmcgxyVM6LmhNauIpnfrS2xmeti5hqDf5kgJr31Uxjy9ZoqoTAXGKu8xeYtG3yMAdp3VYL10S89-6WPfuk3vzLq-4zqsvoHRJWfoYT9B8N7kdM</recordid><startdate>20240101</startdate><enddate>20240101</enddate><creator>Letras, Martin</creator><creator>Falk, Joachim</creator><creator>Teich, Jurgen</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-6285-5862</orcidid><orcidid>https://orcid.org/0000-0002-1429-8982</orcidid><orcidid>https://orcid.org/0009-0006-0834-3237</orcidid></search><sort><creationdate>20240101</creationdate><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><author>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Buffers</topic><topic>Communication</topic><topic>Dataflow Networks</topic><topic>FIFO</topic><topic>Firing</topic><topic>Heuristic</topic><topic>Image processing</topic><topic>Integer programming</topic><topic>Many-Core Systems</topic><topic>Mapping</topic><topic>Memory Management</topic><topic>Minimization</topic><topic>Modulo Scheduling</topic><topic>Pareto Optimization</topic><topic>Placement</topic><topic>Processor scheduling</topic><topic>Schedules</topic><topic>Scheduling</topic><topic>Solvers</topic><topic>Space exploration</topic><topic>Throughput</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Letras, Martin</creatorcontrib><creatorcontrib>Falk, Joachim</creatorcontrib><creatorcontrib>Teich, Jurgen</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Letras, Martin</au><au>Falk, Joachim</au><au>Teich, Jurgen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2024-01-01</date><risdate>2024</risdate><volume>12</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. 
As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2024.3375079</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0001-6285-5862</orcidid><orcidid>https://orcid.org/0000-0002-1429-8982</orcidid><orcidid>https://orcid.org/0009-0006-0834-3237</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2024-01, Vol.12, p.1-1 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2969056009 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Buffers ; Communication ; Dataflow Networks ; FIFO ; Firing ; Heuristic ; Image processing ; Integer programming ; Many-Core Systems ; Mapping ; Memory Management ; Minimization ; Modulo Scheduling ; Pareto Optimization ; Placement ; Processor scheduling ; Schedules ; Scheduling ; Solvers ; Space exploration ; Throughput |
title | Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T04%3A23%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20Multi-Reader%20Buffers%20and%20Channel%20Placement%20during%20Dataflow%20Network%20Mapping%20to%20Heterogeneous%20Many-core%20Systems&rft.jtitle=IEEE%20access&rft.au=Letras,%20Martin&rft.date=2024-01-01&rft.volume=12&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2024.3375079&rft_dat=%3Cproquest_cross%3E2969056009%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2969056009&rft_id=info:pmid/&rft_ieee_id=10463021&rft_doaj_id=oai_doaj_org_article_572379ddeeee4da685128d4115cdbc1c&rfr_iscdi=true |
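
The abstract describes Multi-Reader Buffers (MRBs) as FIFO buffers that forward identical data to several actors while storing each data item only once. The following is a minimal illustrative sketch of that idea, not the authors' implementation: the class name `MultiReaderBuffer`, its methods, and the ring-buffer layout are assumptions chosen for the example, and access contention, channel placement, and scheduling are not modeled.

```python
class MultiReaderBuffer:
    """Hypothetical bounded FIFO that stores each token once for several readers."""

    def __init__(self, capacity, num_readers):
        self.capacity = capacity
        self.slots = [None] * capacity        # shared storage, one copy per token
        self.write_count = 0                  # total tokens written so far
        self.read_counts = [0] * num_readers  # tokens consumed by each reader

    def free_slots(self):
        # A slot can be reused only after the slowest reader has consumed it.
        return self.capacity - (self.write_count - min(self.read_counts))

    def write(self, token):
        """Store a token once; return False if the buffer is full."""
        if self.free_slots() == 0:
            return False
        self.slots[self.write_count % self.capacity] = token
        self.write_count += 1
        return True

    def available(self, reader):
        # Number of tokens this reader has not consumed yet.
        return self.write_count - self.read_counts[reader]

    def read(self, reader):
        """Return the next token for this reader in FIFO order."""
        if self.available(reader) == 0:
            raise IndexError("no unread token for this reader")
        token = self.slots[self.read_counts[reader] % self.capacity]
        self.read_counts[reader] += 1
        return token
```

In this sketch, a multi-cast actor feeding two consumers would need two separate FIFOs holding copies of every token, whereas one `MultiReaderBuffer(capacity, 2)` holds each token a single time. The price is that the writer blocks until the slowest reader has advanced, which is one simplified way to picture why sharing can lengthen the achievable period.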