Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems

This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by...

Full description

Bibliographic details
Published in: IEEE access 2024-01, Vol.12, p.1-1
Main authors: Letras, Martin; Falk, Joachim; Teich, Jurgen
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1
container_issue
container_start_page 1
container_title IEEE access
container_volume 12
creator Letras, Martin
Falk, Joachim
Teich, Jurgen
description This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications while minimizing the period when deployed on a many-core target. Implementations of dataflow applications often suffer from data duplication if identical data has to be processed by multiple actors. In particular, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be used to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In contrast to the state of the art, our approach (i) considers memory-size constraints for on-chip memories, (ii) considers hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type-dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS). Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large-scale dataflow network problems.
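The description above says that an MRB forwards identical data to several actors in FIFO order while storing each data item only once. The following is a minimal illustrative sketch of that idea, not the authors' implementation: the class name `MultiReaderBuffer`, its methods, and the freeing policy (a token is released once every reader has consumed it) are assumptions made for this example, and contention and memory placement are not modeled.

```python
from collections import deque


class MultiReaderBuffer:
    """Hypothetical MRB sketch: one shared store, one read cursor per reader."""

    def __init__(self, capacity, readers):
        self.capacity = capacity
        self.items = deque()                    # each token is stored exactly once
        self.base = 0                           # absolute index of items[0]
        self.cursor = {r: 0 for r in readers}   # next absolute index per reader

    def can_write(self):
        # A slot is free only after the slowest reader has consumed its token.
        return len(self.items) < self.capacity

    def write(self, token):
        if not self.can_write():
            raise BufferError("buffer full: slowest reader has not caught up")
        self.items.append(token)

    def can_read(self, reader):
        return self.cursor[reader] < self.base + len(self.items)

    def read(self, reader):
        if not self.can_read(reader):
            raise BufferError("no unread token for this reader")
        token = self.items[self.cursor[reader] - self.base]
        self.cursor[reader] += 1
        # Release tokens that every reader has already consumed.
        while self.items and min(self.cursor.values()) > self.base:
            self.items.popleft()
            self.base += 1
        return token
```

With a multi-cast actor, two readers would each need their own FIFO holding a copy of every token; in this sketch both readers share one store, at the price that a slow reader can block the writer.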
doi_str_mv 10.1109/ACCESS.2024.3375079
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2969056009</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10463021</ieee_id><doaj_id>oai_doaj_org_article_572379ddeeee4da685128d4115cdbc1c</doaj_id><sourcerecordid>2969056009</sourcerecordid><originalsourceid>FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</originalsourceid><addsrcrecordid>eNpNUU1v2zAMNYYNWNH2F2wHAT0704clWcfWS9cC_RiW3gVFojOnjuVKMrL8-yl1UZQXEiTfeyReUXwjeEEIVj8um2a5Wi0optWCMcmxVJ-KE0qEKhln4vOH-mtxHuMW56hzi8uTYr_8N_Y-dMMG3U996so_YBwEdDW1LYSIzOBQ89cMA_Tod28s7GBIyE2viJ8mmbb3e_QAae_DM7o343gcJI9uIEHwGxjATzEPhkNpfQC0OsQEu3hWfGlNH-H8LZ8WT9fLp-amvHv8ddtc3pWWcZVKA5WsceWow5ZbuVatw7XgUsq6JoassaCEW9pWmBmCnRRKcMEwdpbgVgE7LW5nWufNVo-h25lw0N50-rXhw0abkDrbg-aSMqmcgxyVM6LmhNauIpnfrS2xmeti5hqDf5kgJr31Uxjy9ZoqoTAXGKu8xeYtG3yMAdp3VYL10S89-6WPfuk3vzLq-4zqsvoHRJWfoYT9B8N7kdM</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2969056009</pqid></control><display><type>article</type><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</creator><creatorcontrib>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</creatorcontrib><description>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. 
In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2024.3375079</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Buffers ; Communication ; Dataflow Networks ; FIFO ; Firing ; Heuristic ; Image processing ; Integer programming ; Many-Core Systems ; Mapping ; Memory Management ; Minimization ; Modulo Scheduling ; Pareto Optimization ; Placement ; Processor scheduling ; Schedules ; Scheduling ; Solvers ; Space exploration ; Throughput</subject><ispartof>IEEE access, 2024-01, Vol.12, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</cites><orcidid>0000-0001-6285-5862 ; 0000-0002-1429-8982 ; 0009-0006-0834-3237</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10463021$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2100,27624,27915,27916,54924</link.rule.ids></links><search><creatorcontrib>Letras, Martin</creatorcontrib><creatorcontrib>Falk, Joachim</creatorcontrib><creatorcontrib>Teich, Jurgen</creatorcontrib><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><title>IEEE access</title><addtitle>Access</addtitle><description>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. 
This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</description><subject>Buffers</subject><subject>Communication</subject><subject>Dataflow Networks</subject><subject>FIFO</subject><subject>Firing</subject><subject>Heuristic</subject><subject>Image processing</subject><subject>Integer programming</subject><subject>Many-Core Systems</subject><subject>Mapping</subject><subject>Memory Management</subject><subject>Minimization</subject><subject>Modulo Scheduling</subject><subject>Pareto Optimization</subject><subject>Placement</subject><subject>Processor scheduling</subject><subject>Schedules</subject><subject>Scheduling</subject><subject>Solvers</subject><subject>Space exploration</subject><subject>Throughput</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUU1v2zAMNYYNWNH2F2wHAT0704clWcfWS9cC_RiW3gVFojOnjuVKMrL8-yl1UZQXEiTfeyReUXwjeEEIVj8um2a5Wi0optWCMcmxVJ-KE0qEKhln4vOH-mtxHuMW56hzi8uTYr_8N_Y-dMMG3U996so_YBwEdDW1LYSIzOBQ89cMA_Tod28s7GBIyE2viJ8mmbb3e_QAae_DM7o343gcJI9uIEHwGxjATzEPhkNpfQC0OsQEu3hWfGlNH-H8LZ8WT9fLp-amvHv8ddtc3pWWcZVKA5WsceWow5ZbuVatw7XgUsq6JoassaCEW9pWmBmCnRRKcMEwdpbgVgE7LW5nWufNVo-h25lw0N50-rXhw0abkDrbg-aSMqmcgxyVM6LmhNauIpnfrS2xmeti5hqDf5kgJr31Uxjy9ZoqoTAXGKu8xeYtG3yMAdp3VYL10S89-6WPfuk3vzLq-4zqsvoHRJWfoYT9B8N7kdM</recordid><startdate>20240101</startdate><enddate>20240101</enddate><creator>Letras, Martin</creator><creator>Falk, Joachim</creator><creator>Teich, Jurgen</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-6285-5862</orcidid><orcidid>https://orcid.org/0000-0002-1429-8982</orcidid><orcidid>https://orcid.org/0009-0006-0834-3237</orcidid></search><sort><creationdate>20240101</creationdate><title>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</title><author>Letras, Martin ; Falk, Joachim ; Teich, Jurgen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c359t-ae47804d2d0c5c7b9fd0865777881a1b06215c2f403a10d769656300dc10f9e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Buffers</topic><topic>Communication</topic><topic>Dataflow Networks</topic><topic>FIFO</topic><topic>Firing</topic><topic>Heuristic</topic><topic>Image processing</topic><topic>Integer programming</topic><topic>Many-Core Systems</topic><topic>Mapping</topic><topic>Memory Management</topic><topic>Minimization</topic><topic>Modulo Scheduling</topic><topic>Pareto Optimization</topic><topic>Placement</topic><topic>Processor scheduling</topic><topic>Schedules</topic><topic>Scheduling</topic><topic>Solvers</topic><topic>Space exploration</topic><topic>Throughput</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Letras, Martin</creatorcontrib><creatorcontrib>Falk, Joachim</creatorcontrib><creatorcontrib>Teich, Jurgen</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Letras, Martin</au><au>Falk, Joachim</au><au>Teich, Jurgen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2024-01-01</date><risdate>2024</risdate><volume>12</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>This paper presents an approach for reducing the memory requirements of periodically executed dataflow applications, while minimizing the period when deployed on a many-core target. Often, implementations of dataflow applications suffer from data duplication if identical data has to be processed by multiple actors. In fact, multi-cast (also called fork) actors can produce huge memory overheads when storing and communicating copies of the same data. 
As a remedy, so-called Multi-Reader Buffers (MRBs) can be utilized to forward identical data to multiple actors in a First In First Out (FIFO) manner while storing each data item only once by sharing. However, using MRBs may increase the achievable period due to contention when accessing the shared data. This paper proposes a novel multi-objective design space exploration approach that selectively replaces multi-cast actors with MRBs and explores actor and FIFO channel mappings to find trade-offs between the objectives of period, memory footprint, and core cost. In distinction to the state-of-the-art, our approach considers (i) memory-size constraints for on-chip memories, (ii) hierarchical memories to implement the buffers, e.g., tile-local memories, (iii) supports heterogeneous many-core platforms, i.e., core-type dependent actor execution times, and (iv) optimizes the buffer placement and overall scheduling to minimize the execution period by proposing a novel combined actor and communication scheduling heuristic for period minimization called Communication-Aware Periodic Scheduling on Heterogeneous Many-core Systems (CAPS-HMS) . Our results show that the explored Pareto fronts improve a hypervolume indicator over a reference approach by up to 66% for small to mid-size applications and 90% for large applications. Moreover, selectively replacing multi-cast actors with corresponding MRBs proves to be always superior to never or always replacing them. 
Finally, it is shown that the quality of the explored Pareto fronts does not degrade when replacing the efficient scheduling heuristic CAPS-HMS by an exact Integer Linear Program (ILP) solver that requires orders of magnitude higher solver times and thus cannot be applied to large scale dataflow network problems.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2024.3375079</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0001-6285-5862</orcidid><orcidid>https://orcid.org/0000-0002-1429-8982</orcidid><orcidid>https://orcid.org/0009-0006-0834-3237</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2024-01, Vol.12, p.1-1
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2969056009
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Buffers
Communication
Dataflow Networks
FIFO
Firing
Heuristic
Image processing
Integer programming
Many-Core Systems
Mapping
Memory Management
Minimization
Modulo Scheduling
Pareto Optimization
Placement
Processor scheduling
Schedules
Scheduling
Solvers
Space exploration
Throughput
title Exploring Multi-Reader Buffers and Channel Placement during Dataflow Network Mapping to Heterogeneous Many-core Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T04%3A23%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20Multi-Reader%20Buffers%20and%20Channel%20Placement%20during%20Dataflow%20Network%20Mapping%20to%20Heterogeneous%20Many-core%20Systems&rft.jtitle=IEEE%20access&rft.au=Letras,%20Martin&rft.date=2024-01-01&rft.volume=12&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2024.3375079&rft_dat=%3Cproquest_cross%3E2969056009%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2969056009&rft_id=info:pmid/&rft_ieee_id=10463021&rft_doaj_id=oai_doaj_org_article_572379ddeeee4da685128d4115cdbc1c&rfr_iscdi=true