Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device

Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA devices provide tighter integration between software running on CPUs and hardware accelerators. Modern heterogeneous SoC-FPGA platforms support multiple I/O cache coherence options between CPUs and FPGAs, but these options can...

Detailed description

Bibliographic details
Main authors: Min, Seung Won; Huang, Sitao; El-Hadedy, Mohamed; Xiong, Jinjun; Chen, Deming; Hwu, Wen-mei
Format: Article
Language: eng
creator Min, Seung Won
Huang, Sitao
El-Hadedy, Mohamed
Xiong, Jinjun
Chen, Deming
Hwu, Wen-mei
description Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA devices provide tighter integration between software running on CPUs and hardware accelerators. Modern heterogeneous SoC-FPGA platforms support multiple I/O cache coherence options between CPUs and FPGAs, but these options can have unintended effects on the achieved bandwidth depending on the application and its data access patterns. To provide the most efficient communication between CPUs and accelerators, understanding the data transaction behavior and selecting the right I/O cache coherence method is essential. In this paper, we use the Xilinx Zynq UltraScale+ as the SoC platform to show how a given I/O cache coherence method can perform better or worse in different situations, ultimately affecting overall accelerator performance as well. Based on our analysis, we further explore possible software and hardware modifications to improve I/O performance with different I/O cache coherence options. With our proposed modifications, the overall performance of the SoC design can be improved by 20% on average.
doi_str_mv 10.48550/arxiv.1908.01261
format Article
creationdate 2019-08-03
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1908.01261
language eng
recordid cdi_arxiv_primary_1908_01261
source arXiv.org
subjects Computer Science - Hardware Architecture
title Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T12%3A18%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analysis%20and%20Optimization%20of%20I/O%20Cache%20Coherency%20Strategies%20for%20SoC-FPGA%20Device&rft.au=Min,%20Seung%20Won&rft.date=2019-08-03&rft_id=info:doi/10.48550/arxiv.1908.01261&rft_dat=%3Carxiv_GOX%3E1908_01261%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true