CIFER: A Cache-Coherent 12-nm 16-mm2 SoC With Four 64-Bit RISC-V Application Cores, 18 32-Bit RISC-V Compute Cores, and a 1541 LUT6/mm2 Synthesizable eFPGA

This letter presents CIFER, the world’s first open-source, fully cache-coherent, heterogeneous many-core, CPU-FPGA system-on-chip. The 12-nm, 16-mm2 chip integrates four 64-bit, OS-capable, RISC-V application cores; three TinyCore clusters that each contain six 32-bit, RISC-V compute cores (18 in t...

Bibliographic details
Published in: IEEE solid-state circuits letters, 2023-01, Vol.6, p.229
Main authors: Ang, Li; Ting-Jung, Chang; Gao, Fei; Ta, Tuan; Tziantzioulis, Georgios; Ou, Yanghui; Wang, Moyang; Tu, Jinzheng; Xu, Kaifeng; Jackson, Paul; August, Ning; Chirkov, Grigory; Orenes-Vera, Marcelo; Agwa, Shady; Yan, Xiaoyu; Tang, Eric; Balkind, Jonathan; Batten, Christopher; Wentzlaff, David
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page 229
container_title IEEE solid-state circuits letters
container_volume 6
creator Ang, Li
Ting-Jung, Chang
Gao, Fei
Ta, Tuan
Tziantzioulis, Georgios
Ou, Yanghui
Wang, Moyang
Tu, Jinzheng
Xu, Kaifeng
Jackson, Paul
August, Ning
Chirkov, Grigory
Orenes-Vera, Marcelo
Agwa, Shady
Yan, Xiaoyu
Tang, Eric
Balkind, Jonathan
Batten, Christopher
Wentzlaff, David
description This letter presents CIFER, the world’s first open-source, fully cache-coherent, heterogeneous many-core, CPU-FPGA system-on-chip. The 12-nm, 16-mm2 chip integrates four 64-bit, OS-capable, RISC-V application cores; three TinyCore clusters that each contain six 32-bit, RISC-V compute cores (18 in total); and an electronic design automation-synthesized, standard-cell-based eFPGA. CIFER enables the decomposition of real-world applications and tailored execution (parallelization or specialization) per decomposed task. Our evaluation shows that: 1) the TinyCore clusters increase the throughput and energy efficiency of data- and thread-parallel tasks by up to [Formula Omitted] and [Formula Omitted] over one 64-bit core, respectively; 2) the eFPGA increases the throughput and energy efficiency of hardware-accelerable tasks by up to [Formula Omitted] and [Formula Omitted], respectively; and 3) using coherent caches for data transfer between the processors and the eFPGA increases the throughput and energy efficiency by up to [Formula Omitted] and [Formula Omitted], respectively.
doi_str_mv 10.1109/LSSC.2023.3303111
format Article
publisher Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
rights Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
creationdate 2023
genre article
fulltext fulltext
identifier EISSN: 2573-9603
ispartof IEEE solid-state circuits letters, 2023-01, Vol.6, p.229
issn 2573-9603
language eng
recordid cdi_proquest_journals_2862649746
source IEEE Electronic Library (IEL)
subjects Clusters
Coherence
Data transfer (computers)
Decomposition
Design standards
Electronic design automation
Synthesis
System on chip
title CIFER: A Cache-Coherent 12-nm 16-mm2 SoC With Four 64-Bit RISC-V Application Cores, 18 32-Bit RISC-V Compute Cores, and a 1541 LUT6/mm2 Synthesizable eFPGA
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T02%3A32%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CIFER:%20A%20Cache-Coherent%2012-nm%2016-mm2%20SoC%20With%20Four%2064-Bit%20RISC-V%20Application%20Cores,%2018%2032-Bit%20RISC-V%20Compute%20Cores,%20and%20a%201541%20LUT6/mm2%20Synthesizable%20eFPGA&rft.jtitle=IEEE%20solid-state%20circuits%20letters&rft.au=Ang,%20Li&rft.date=2023-01-01&rft.volume=6&rft.spage=229&rft.pages=229-&rft.eissn=2573-9603&rft_id=info:doi/10.1109/LSSC.2023.3303111&rft_dat=%3Cproquest%3E2862649746%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2862649746&rft_id=info:pmid/&rfr_iscdi=true