Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device
Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA devices provide tighter integrations between software running on CPUs and hardware accelerators. Modern heterogeneous SoC-FPGA platforms support multiple I/O cache coherence options between CPUs and FPGAs, but these options can...
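Main authors: | Min, Seung Won; Huang, Sitao; El-Hadedy, Mohamed; Xiong, Jinjun; Chen, Deming; Hwu, Wen-mei |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Hardware Architecture |
Online access: | Order full text |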
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Min, Seung Won ; Huang, Sitao ; El-Hadedy, Mohamed ; Xiong, Jinjun ; Chen, Deming ; Hwu, Wen-mei |
description | Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA
devices provide tighter integration between software running on CPUs and
hardware accelerators. Modern heterogeneous SoC-FPGA platforms support multiple
I/O cache coherence options between CPUs and FPGAs, but these options can have
unintended effects on the achieved bandwidth depending on the application and
data access patterns. To provide the most efficient communication between CPUs
and accelerators, understanding the data transaction behavior and selecting
the right I/O cache coherence method are essential. In this paper, we use Xilinx
Zynq UltraScale+ as the SoC platform to show how a given I/O cache coherence
method can perform better or worse in different situations, ultimately
affecting overall accelerator performance as well. Based on our analysis,
we further explore possible software and hardware modifications to improve
I/O performance under the different I/O cache coherence options. With our proposed
modifications, the overall performance of the SoC design improves by 20% on
average. |
doi_str_mv | 10.48550/arxiv.1908.01261 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1908.01261 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1908_01261 |
source | arXiv.org |
subjects | Computer Science - Hardware Architecture |
title | Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T12%3A18%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analysis%20and%20Optimization%20of%20I/O%20Cache%20Coherency%20Strategies%20for%20SoC-FPGA%20Device&rft.au=Min,%20Seung%20Won&rft.date=2019-08-03&rft_id=info:doi/10.48550/arxiv.1908.01261&rft_dat=%3Carxiv_GOX%3E1908_01261%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |