Optimizing Floating Point Units in Hybrid FPGAs
This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include r...
Published in: | IEEE transactions on very large scale integration (VLSI) systems 2012-07, Vol.20 (7), p.1295-1303 |
---|---|
Main authors: | ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1303 |
---|---|
container_issue | 7 |
container_start_page | 1295 |
container_title | IEEE transactions on very large scale integration (VLSI) systems |
container_volume | 20 |
creator | ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E. |
description | This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput. |
doi_str_mv | 10.1109/TVLSI.2011.2153883 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_5893965</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5893965</ieee_id><sourcerecordid>2680918711</sourcerecordid><originalsourceid>FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</originalsourceid><addsrcrecordid>eNpd0N9LwzAQB_AgCs7pP6AvBRF86XaXNG3yOIb7AYMN3HwtaZpKRtfOpHuYf72ZGz6Ylxzc547jS8gjwgAR5HD9sXifDyggDihyJgS7Ij3kPItleNehhpTFgiLckjvvtwCYJBJ6ZLjcd3Znv23zGU3qVnWnYtXapos2je18ZJtodiycLaPJajry9-SmUrU3D5e_TzaTt_V4Fi-W0_l4tIg146yLSzAp1zRLioLKkmOmDdWgdCY0g9SIIisrWomMJ1liSgpSouKKMkYDwsSwPnk979279utgfJfvrNemrlVj2oPPERgGLDgL9Pkf3bYH14TrgkIJQgAkQdGz0q713pkq3zu7U-4YUH7KMP_NMD9lmF8yDEMvl9XKa1VXTjXa-r9JmobDORfBPZ2dNcb8tbmQTKac_QAcineb</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1019088004</pqid></control><display><type>article</type><title>Optimizing Floating Point Units in Hybrid FPGAs</title><source>IEEE Electronic Library (IEL)</source><creator>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E.</creator><creatorcontrib>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. E.</creatorcontrib><description>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. 
Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</description><identifier>ISSN: 1063-8210</identifier><identifier>EISSN: 1557-9999</identifier><identifier>DOI: 10.1109/TVLSI.2011.2153883</identifier><identifier>CODEN: IEVSE9</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Applied sciences ; Architecture ; Benchmark testing ; Circuit properties ; Common subgraph extraction ; Computer architecture ; Density ; Design. Technologies. Operation analysis. Testing ; Digital circuits ; Electric, optical and optoelectronic circuits ; Electronic circuits ; Electronics ; Exact sciences and technology ; Field programmable gate arrays ; field-programmable gate array (FPGA) ; floating point (FP) ; Floating point arithmetic ; High density ; Integrated circuits ; Lookup tables ; Merging ; Methodology ; Optimization ; Routing ; Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices ; Studies ; Very large scale integration ; Wires</subject><ispartof>IEEE transactions on very large scale integration (VLSI) systems, 2012-07, Vol.20 (7), p.1295-1303</ispartof><rights>2015 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Jul 2012</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5893965$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,781,785,797,27929,27930,54763</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5893965$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=26099558$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>ChiWai Yu</creatorcontrib><creatorcontrib>Smith, A. M.</creatorcontrib><creatorcontrib>Luk, W.</creatorcontrib><creatorcontrib>Leong, P. H. W.</creatorcontrib><creatorcontrib>Wilton, S. J. E.</creatorcontrib><title>Optimizing Floating Point Units in Hybrid FPGAs</title><title>IEEE transactions on very large scale integration (VLSI) systems</title><addtitle>TVLSI</addtitle><description>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. 
Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</description><subject>Applied sciences</subject><subject>Architecture</subject><subject>Benchmark testing</subject><subject>Circuit properties</subject><subject>Common subgraph extraction</subject><subject>Computer architecture</subject><subject>Density</subject><subject>Design. Technologies. Operation analysis. Testing</subject><subject>Digital circuits</subject><subject>Electric, optical and optoelectronic circuits</subject><subject>Electronic circuits</subject><subject>Electronics</subject><subject>Exact sciences and technology</subject><subject>Field programmable gate arrays</subject><subject>field-programmable gate array (FPGA)</subject><subject>floating point (FP)</subject><subject>Floating point arithmetic</subject><subject>High density</subject><subject>Integrated circuits</subject><subject>Lookup tables</subject><subject>Merging</subject><subject>Methodology</subject><subject>Optimization</subject><subject>Routing</subject><subject>Semiconductor electronics. Microelectronics. Optoelectronics. 
Solid state devices</subject><subject>Studies</subject><subject>Very large scale integration</subject><subject>Wires</subject><issn>1063-8210</issn><issn>1557-9999</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpd0N9LwzAQB_AgCs7pP6AvBRF86XaXNG3yOIb7AYMN3HwtaZpKRtfOpHuYf72ZGz6Ylxzc547jS8gjwgAR5HD9sXifDyggDihyJgS7Ij3kPItleNehhpTFgiLckjvvtwCYJBJ6ZLjcd3Znv23zGU3qVnWnYtXapos2je18ZJtodiycLaPJajry9-SmUrU3D5e_TzaTt_V4Fi-W0_l4tIg146yLSzAp1zRLioLKkmOmDdWgdCY0g9SIIisrWomMJ1liSgpSouKKMkYDwsSwPnk979279utgfJfvrNemrlVj2oPPERgGLDgL9Pkf3bYH14TrgkIJQgAkQdGz0q713pkq3zu7U-4YUH7KMP_NMD9lmF8yDEMvl9XKa1VXTjXa-r9JmobDORfBPZ2dNcb8tbmQTKac_QAcineb</recordid><startdate>20120701</startdate><enddate>20120701</enddate><creator>ChiWai Yu</creator><creator>Smith, A. M.</creator><creator>Luk, W.</creator><creator>Leong, P. H. W.</creator><creator>Wilton, S. J. E.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20120701</creationdate><title>Optimizing Floating Point Units in Hybrid FPGAs</title><author>ChiWai Yu ; Smith, A. M. ; Luk, W. ; Leong, P. H. W. ; Wilton, S. J. 
E.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c353t-d0e65c274bb29d517ce2c0ac78c306e8b7df2f875474ed20991a5a23320ac14e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Applied sciences</topic><topic>Architecture</topic><topic>Benchmark testing</topic><topic>Circuit properties</topic><topic>Common subgraph extraction</topic><topic>Computer architecture</topic><topic>Density</topic><topic>Design. Technologies. Operation analysis. Testing</topic><topic>Digital circuits</topic><topic>Electric, optical and optoelectronic circuits</topic><topic>Electronic circuits</topic><topic>Electronics</topic><topic>Exact sciences and technology</topic><topic>Field programmable gate arrays</topic><topic>field-programmable gate array (FPGA)</topic><topic>floating point (FP)</topic><topic>Floating point arithmetic</topic><topic>High density</topic><topic>Integrated circuits</topic><topic>Lookup tables</topic><topic>Merging</topic><topic>Methodology</topic><topic>Optimization</topic><topic>Routing</topic><topic>Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices</topic><topic>Studies</topic><topic>Very large scale integration</topic><topic>Wires</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>ChiWai Yu</creatorcontrib><creatorcontrib>Smith, A. M.</creatorcontrib><creatorcontrib>Luk, W.</creatorcontrib><creatorcontrib>Leong, P. H. W.</creatorcontrib><creatorcontrib>Wilton, S. J. 
E.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>ChiWai Yu</au><au>Smith, A. M.</au><au>Luk, W.</au><au>Leong, P. H. W.</au><au>Wilton, S. J. E.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimizing Floating Point Units in Hybrid FPGAs</atitle><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle><stitle>TVLSI</stitle><date>2012-07-01</date><risdate>2012</risdate><volume>20</volume><issue>7</issue><spage>1295</spage><epage>1303</epage><pages>1295-1303</pages><issn>1063-8210</issn><eissn>1557-9999</eissn><coden>IEVSE9</coden><abstract>This paper introduces a methodology to optimize coarse-grained floating point units (FPUs) in a hybrid field-programmable gate array (FPGA), where the FPU consists of a number of interconnected floating point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs) which can implement fixed point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed and utilization tradeoff over a set of floating point benchmark circuits. 
We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. This proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that although high density FPUs are slower, they have the advantages of improved area, area-delay product, and throughput.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TVLSI.2011.2153883</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-8210 |
ispartof | IEEE transactions on very large scale integration (VLSI) systems, 2012-07, Vol.20 (7), p.1295-1303 |
issn | 1063-8210 ; 1557-9999 |
language | eng |
recordid | cdi_ieee_primary_5893965 |
source | IEEE Electronic Library (IEL) |
subjects | Applied sciences ; Architecture ; Benchmark testing ; Circuit properties ; Common subgraph extraction ; Computer architecture ; Density ; Design. Technologies. Operation analysis. Testing ; Digital circuits ; Electric, optical and optoelectronic circuits ; Electronic circuits ; Electronics ; Exact sciences and technology ; Field programmable gate arrays ; field-programmable gate array (FPGA) ; floating point (FP) ; Floating point arithmetic ; High density ; Integrated circuits ; Lookup tables ; Merging ; Methodology ; Optimization ; Routing ; Semiconductor electronics. Microelectronics. Optoelectronics. Solid state devices ; Studies ; Very large scale integration ; Wires |
title | Optimizing Floating Point Units in Hybrid FPGAs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T11%3A52%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimizing%20Floating%20Point%20Units%20in%20Hybrid%20FPGAs&rft.jtitle=IEEE%20transactions%20on%20very%20large%20scale%20integration%20(VLSI)%20systems&rft.au=ChiWai%20Yu&rft.date=2012-07-01&rft.volume=20&rft.issue=7&rft.spage=1295&rft.epage=1303&rft.pages=1295-1303&rft.issn=1063-8210&rft.eissn=1557-9999&rft.coden=IEVSE9&rft_id=info:doi/10.1109/TVLSI.2011.2153883&rft_dat=%3Cproquest_RIE%3E2680918711%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1019088004&rft_id=info:pmid/&rft_ieee_id=5893965&rfr_iscdi=true |