Optimizing Floating Point Units in Hybrid FPGAs

This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include r...

Bibliographic details
Published in: IEEE transactions on very large scale integration (VLSI) systems 2012-07, Vol.20 (7), p.1295-1303
Main authors: ChiWai Yu, Smith, A. M., Luk, W., Leong, P. H. W., Wilton, S. J. E.
Format: Article
Language: eng
container_end_page 1303
container_issue 7
container_start_page 1295
container_title IEEE transactions on very large scale integration (VLSI) systems
container_volume 20
creator ChiWai Yu
Smith, A. M.
Luk, W.
Leong, P. H. W.
Wilton, S. J. E.
description This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.
doi_str_mv 10.1109/TVLSI.2011.2153883
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_5893965</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5893965</ieee_id><sourcerecordid>2680918711</sourcerecordid><originalsourceid>FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</originalsourceid><addsrcrecordid>eNpd0N9LwzAQB_AgCs7pP6AvBRF86XaXNG3yOIb7AYMN3HwtaZpKRtfOpHuYf72ZGz6Ylxzc547jS8gjwgAR5HD9sXifDyggDihyJgS7Ij3kPItleNehhpTFgiLckjvvtwCYJBJ6ZLjcd3Znv23zGU3qVnWnYtXapos2je18ZJtodiycLaPJajry9-SmUrU3D5e_TzaTt_V4Fi-W0_l4tIg146yLSzAp1zRLioLKkmOmDdWgdCY0g9SIIisrWomMJ1liSgpSouKKMkYDwsSwPnk979279utgfJfvrNemrlVj2oPPERgGLDgL9Pkf3bYH14TrgkIJQgAkQdGz0q713pkq3zu7U-4YUH7KMP_NMD9lmF8yDEMvl9XKa1VXTjXa-r9JmobDORfBPZ2dNcb8tbmQTKac_QAcineb</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1019088004</pqid></control><display><type>article</type><title>Optimizing Floating Point Units in Hybrid FPGAs</title><source>IEEE Electronic Library (IEL)</source><creator>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E.</creator><creatorcontrib>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E.</creatorcontrib><description>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. 
Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</description><identifier>ISSN: 1063-8210</identifier><identifier>EISSN: 1557-9999</identifier><identifier>DOI: 10.1109/TVLSI.2011.2153883</identifier><identifier>CODEN: IEVSE9</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Applied sciences ; Architecture ; Benchmark testing ; Circuit properties ; Common subgraph extraction ; Computer architecture ; Density ; Design. Technologies. Operation analysis. Testing ; Digital circuits ; Electric, optical and optoelectronic circuits ; Electronic circuits ; Electronics ; Exact sciences and technology ; Field programmable gate arrays ; field-programmable gate array (FPGA) ; floating point (FP) ; Floating point arithmetic ; High density ; Integrated circuits ; Lookup tables ; Merging ; Methodology ; Optimization ; Routing ; Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices ; Studies ; Very large scale integration ; Wires</subject><ispartof>IEEE transactions on very large scale integration (VLSI) systems, 2012-07, Vol.20 (7), p.1295-1303</ispartof><rights>2015 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Jul 2012</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5893965$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,781,785,797,27929,27930,54763</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5893965$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=26099558$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>ChiWai Yu</creatorcontrib><creatorcontrib>Smith, A. M.</creatorcontrib><creatorcontrib>Luk, W.</creatorcontrib><creatorcontrib>Leong, P. H. W.</creatorcontrib><creatorcontrib>Wilton, S. J. E.</creatorcontrib><title>Optimizing Floating Point Units in Hybrid FPGAs</title><title>IEEE transactions on very large scale integration (VLSI) systems</title><addtitle>TVLSI</addtitle><description>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. 
Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</description><subject>Applied sciences</subject><subject>Architecture</subject><subject>Benchmark testing</subject><subject>Circuit properties</subject><subject>Common subgraph extraction</subject><subject>Computer architecture</subject><subject>Density</subject><subject>Design. Technologies. Operation analysis. Testing</subject><subject>Digital circuits</subject><subject>Electric, optical and optoelectronic circuits</subject><subject>Electronic circuits</subject><subject>Electronics</subject><subject>Exact sciences and technology</subject><subject>Field programmable gate arrays</subject><subject>field-programmable gate array (FPGA)</subject><subject>floating point (FP)</subject><subject>Floating point arithmetic</subject><subject>High density</subject><subject>Integrated circuits</subject><subject>Lookup tables</subject><subject>Merging</subject><subject>Methodology</subject><subject>Optimization</subject><subject>Routing</subject><subject>Semiconductor electronics. Microelectronics. Optoelectronics. 
Solid state devices</subject><subject>Studies</subject><subject>Very large scale integration</subject><subject>Wires</subject><issn>1063-8210</issn><issn>1557-9999</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpd0N9LwzAQB_AgCs7pP6AvBRF86XaXNG3yOIb7AYMN3HwtaZpKRtfOpHuYf72ZGz6Ylxzc547jS8gjwgAR5HD9sXifDyggDihyJgS7Ij3kPItleNehhpTFgiLckjvvtwCYJBJ6ZLjcd3Znv23zGU3qVnWnYtXapos2je18ZJtodiycLaPJajry9-SmUrU3D5e_TzaTt_V4Fi-W0_l4tIg146yLSzAp1zRLioLKkmOmDdWgdCY0g9SIIisrWomMJ1liSgpSouKKMkYDwsSwPnk979279utgfJfvrNemrlVj2oPPERgGLDgL9Pkf3bYH14TrgkIJQgAkQdGz0q713pkq3zu7U-4YUH7KMP_NMD9lmF8yDEMvl9XKa1VXTjXa-r9JmobDORfBPZ2dNcb8tbmQTKac_QAcineb</recordid><startdate>20120701</startdate><enddate>20120701</enddate><creator>ChiWai Yu</creator><creator>Smith, A. M.</creator><creator>Luk, W.</creator><creator>Leong, P. H. W.</creator><creator>Wilton, S. J. E.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20120701</creationdate><title>Optimizing Floating Point Units in Hybrid FPGAs</title><author>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. 
E.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Applied sciences</topic><topic>Architecture</topic><topic>Benchmark testing</topic><topic>Circuit properties</topic><topic>Common subgraph extraction</topic><topic>Computer architecture</topic><topic>Density</topic><topic>Design. Technologies. Operation analysis. Testing</topic><topic>Digital circuits</topic><topic>Electric, optical and optoelectronic circuits</topic><topic>Electronic circuits</topic><topic>Electronics</topic><topic>Exact sciences and technology</topic><topic>Field programmable gate arrays</topic><topic>field-programmable gate array (FPGA)</topic><topic>floating point (FP)</topic><topic>Floating point arithmetic</topic><topic>High density</topic><topic>Integrated circuits</topic><topic>Lookup tables</topic><topic>Merging</topic><topic>Methodology</topic><topic>Optimization</topic><topic>Routing</topic><topic>Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices</topic><topic>Studies</topic><topic>Very large scale integration</topic><topic>Wires</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>ChiWai Yu</creatorcontrib><creatorcontrib>Smith, A. M.</creatorcontrib><creatorcontrib>Luk, W.</creatorcontrib><creatorcontrib>Leong, P. H. W.</creatorcontrib><creatorcontrib>Wilton, S. J. 
E.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>ChiWai Yu</au><au>Smith, A. M.</au><au>Luk, W.</au><au>Leong, P. H. W.</au><au>Wilton, S. J. E.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimizing Floating Point Units in Hybrid FPGAs</atitle><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle><stitle>TVLSI</stitle><date>2012-07-01</date><risdate>2012</risdate><volume>20</volume><issue>7</issue><spage>1295</spage><epage>1303</epage><pages>1295-1303</pages><issn>1063-8210</issn><eissn>1557-9999</eissn><coden>IEVSE9</coden><abstract>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. 
We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TVLSI.2011.2153883</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1063-8210
ispartof IEEE transactions on very large scale integration (VLSI) systems, 2012-07, Vol.20 (7), p.1295-1303
issn 1063-8210
1557-9999
language eng
recordid cdi_ieee_primary_5893965
source IEEE Electronic Library (IEL)
subjects Applied sciences
Architecture
Benchmark testing
Circuit properties
Common subgraph extraction
Computer architecture
Density
Design. Technologies. Operation analysis. Testing
Digital circuits
Electric, optical and optoelectronic circuits
Electronic circuits
Electronics
Exact sciences and technology
Field programmable gate arrays
field-programmable gate array (FPGA)
floating point (FP)
Floating point arithmetic
High density
Integrated circuits
Lookup tables
Merging
Methodology
Optimization
Routing
Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices
Studies
Very large scale integration
Wires
title Optimizing Floating Point Units in Hybrid FPGAs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T11%3A52%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimizing%20Floating%20Point%20Units%20in%20Hybrid%20FPGAs&rft.jtitle=IEEE%20transactions%20on%20very%20large%20scale%20integration%20(VLSI)%20systems&rft.au=ChiWai%20Yu&rft.date=2012-07-01&rft.volume=20&rft.issue=7&rft.spage=1295&rft.epage=1303&rft.pages=1295-1303&rft.issn=1063-8210&rft.eissn=1557-9999&rft.coden=IEVSE9&rft_id=info:doi/10.1109/TVLSI.2011.2153883&rft_dat=%3Cproquest_RIE%3E2680918711%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1019088004&rft_id=info:pmid/&rft_ieee_id=5893965&rfr_iscdi=true