Hardware Acceleration and Implementation of YOLOX-s for On-Orbit FPGA
The rapid development of remote sensing technology has brought about a sharp increase in the amount of remote sensing image data. However, due to the satellite’s limited hardware resources, space, and power consumption constraints, it is difficult to process massive remote sensing images efficiently...
Published in: | Electronics (Basel) 2022-11, Vol.11 (21), p.3473 |
---|---|
Main authors: | Wang, Ling; Zhou, Hai; Bian, Chunjiang; Jiang, Kangning; Cheng, Xiaolei |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 21 |
container_start_page | 3473 |
container_title | Electronics (Basel) |
container_volume | 11 |
creator | Wang, Ling; Zhou, Hai; Bian, Chunjiang; Jiang, Kangning; Cheng, Xiaolei |
description | The rapid development of remote sensing technology has brought about a sharp increase in the amount of remote sensing image data. However, because of the satellite’s limited hardware resources, space, and power budget, it is difficult to process massive remote sensing images efficiently and robustly with traditional remote sensing image processing methods. In addition, satellite-to-ground target detection places higher demands on speed and accuracy as the volume of remote sensing data keeps growing. To solve these problems, this paper proposes an efficient and reliable acceleration architecture for forward inference of the YOLOX-s detection network on an on-orbit FPGA. Considering the limited onboard resources, the design unrolls the loops over the input channels and output channels in parallel to build the largest possible DSP computing array, ensuring reliable and full utilization of the limited computing resources and thus reducing the inference delay of the entire network. Meanwhile, a three-path cache queue and a small-scale cascaded pooling array are designed, which maximize the reuse of on-chip cache data, effectively relieve the bandwidth bottleneck of the external memory, and keep the entire computing array working efficiently. The experimental results show that at the 200 MHz operating frequency of the VC709, the overall inference performance of the FPGA acceleration can reach 399.62 GOPS, the peak performance can reach 408.4 GOPS, and the overall computing efficiency of the DSP array can reach 97.56%. Compared with previous work, the architecture further improves computing efficiency under limited hardware resources. |
doi_str_mv | 10.3390/electronics11213473 |
format | Article |
fullrecord | recordtype: article; sourceid: gale_proqu; galeid: A745598007; pqid: 2734621839; date: 2022-11-01; publisher: Basel: MDPI AG; rights: COPYRIGHT 2022 MDPI AG. 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.; lds50: peer_reviewed; oa: free_for_read; orcidid: https://orcid.org/0000-0003-4925-5971 |
fulltext | fulltext |
identifier | ISSN: 2079-9292 |
ispartof | Electronics (Basel), 2022-11, Vol.11 (21), p.3473 |
issn | 2079-9292 2079-9292 |
language | eng |
recordid | cdi_proquest_journals_2734621839 |
source | MDPI - Multidisciplinary Digital Publishing Institute; EZB-FREE-00999 freely available EZB journals |
subjects | Acceleration; Accuracy; Aerospace environments; Algorithms; Bandwidths; Channels; Computer architecture; Computing time; Design and construction; Design optimization; Digital integrated circuits; Efficiency; Field programmable gate arrays; Hardware; Image processing; Inference; Neural networks; Object recognition (Computers); Pattern recognition; Power consumption; Remote sensing; Satellite imagery; Satellites; Target detection |
title | Hardware Acceleration and Implementation of YOLOX-s for On-Orbit FPGA |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T00%3A11%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hardware%20Acceleration%20and%20Implementation%20of%20YOLOX-s%20for%20On-Orbit%20FPGA&rft.jtitle=Electronics%20(Basel)&rft.au=Wang,%20Ling&rft.date=2022-11-01&rft.volume=11&rft.issue=21&rft.spage=3473&rft.pages=3473-&rft.issn=2079-9292&rft.eissn=2079-9292&rft_id=info:doi/10.3390/electronics11213473&rft_dat=%3Cgale_proqu%3EA745598007%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2734621839&rft_id=info:pmid/&rft_galeid=A745598007&rfr_iscdi=true |
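
The description above mentions parallel loop unrolling over input and output channels and reports throughput and DSP-array efficiency figures. The following is a minimal Python sketch of that general idea, not the authors' implementation. The tile sizes `tm` and `tn`, the 1x1 (pointwise) convolution, the function names, and the 1024-unit array size are all assumptions chosen for illustration; the record itself does not state the array dimensions.

```python
from math import ceil


def pointwise_conv_tiled(x, w, tm, tn):
    """1x1 convolution with the channel loops split into tiles.

    x: in_ch lists of pixel values, w: out_ch x in_ch weights.
    tm, tn: hypothetical tile sizes for output and input channels.
    """
    in_ch, n_pix, out_ch = len(x), len(x[0]), len(w)
    y = [[0] * n_pix for _ in range(out_ch)]
    for m0 in range(0, out_ch, tm):        # step over output-channel tiles
        for n0 in range(0, in_ch, tn):     # step over input-channel tiles
            for p in range(n_pix):         # one array "cycle" per pixel
                # The two loops below are the ones hardware would unroll
                # into a tm x tn grid of multiply-accumulate units.
                for m in range(m0, min(m0 + tm, out_ch)):
                    for n in range(n0, min(n0 + tn, in_ch)):
                        y[m][p] += w[m][n] * x[n][p]
    return y


def array_utilization(out_ch, in_ch, out_h, out_w, k, tm, tn):
    """Fraction of issued MAC slots that do useful work for one layer."""
    useful = out_ch * in_ch * out_h * out_w * k * k
    cycles = ceil(out_ch / tm) * ceil(in_ch / tn) * out_h * out_w * k * k
    return useful / (cycles * tm * tn)


def theoretical_gops(num_macs, freq_hz):
    """Peak throughput, counting each MAC as two operations."""
    return 2 * num_macs * freq_hz / 1e9


# Assumed array of 1024 MAC units at the 200 MHz clock from the description.
peak = theoretical_gops(1024, 200e6)
efficiency_pct = round(399.62 / peak * 100, 2)
```

Under the assumed 1024 units, `peak` comes out at 409.6 GOPS, and dividing the reported 399.62 GOPS by it gives 97.56%, which is at least consistent with the efficiency quoted in the description. `array_utilization` shows why channel counts matter: a layer with 12 input channels on a 32-wide input tile can fill only 12/32 of the array.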