Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA

Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a high amount of computation and memory storage, which makes it difficu...

Full description

Bibliographic details
Published in: ACM transactions on reconfigurable technology and systems 2023-05, Vol.16 (2), p.1-25, Article 33
Main authors: Suh, Han-Sok; Meng, Jian; Nguyen, Ty; Kumar, Vijay; Cao, Yu; Seo, Jae-Sun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 25
container_issue 2
container_start_page 1
container_title ACM transactions on reconfigurable technology and systems
container_volume 16
creator Suh, Han-Sok
Meng, Jian
Nguyen, Ty
Kumar, Vijay
Cao, Yu
Seo, Jae-Sun
description Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a large amount of computation and memory storage, which makes it difficult to perform efficient inference on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we design and co-optimize an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We train an SSD object detection algorithm with a custom drone dataset. For inference, we employ low-precision quantization and adapt the width of the SSD CNN model. To improve throughput, we use dual-data rate operations for DSPs, which effectively doubles the throughput with limited DSP counts. For different SSD algorithm models, we analyze accuracy, measured as mean average precision (mAP), and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluate the FPGA hardware on a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and a throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1× to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset, using the same FPGA device, at a low power consumption of 2.54 W. For the COCO dataset, our MobileNet-V1 implementation achieves an mAP of 16.8 and an energy efficiency of 4.9 FPS/W, which is approximately 1.9× higher than prior FPGA works or other commercial hardware platforms.
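The energy-efficiency figures in the description follow the usual relation efficiency = throughput / power. The sketch below is only an illustration of that arithmetic, not the authors' measurement procedure: the function names are hypothetical, and the power value it derives is implied by the two drone-dataset figures rather than stated in the record (the 2.54 W figure is quoted separately, for the Pascal VOC comparison).

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return energy efficiency in GOPS/W for a given throughput and power."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput_gops / power_w


def implied_power(throughput_gops: float, efficiency_gops_per_w: float) -> float:
    """Return the power (W) implied by a throughput and an efficiency figure."""
    if efficiency_gops_per_w <= 0:
        raise ValueError("efficiency_gops_per_w must be positive")
    return throughput_gops / efficiency_gops_per_w


# Figures quoted in the description for the multi-drone dataset.
drone_throughput_gops = 158.0
drone_efficiency_gops_per_w = 79.0

# Power implied by those two figures (an inference, not a reported value).
drone_power_w = implied_power(drone_throughput_gops, drone_efficiency_gops_per_w)
print(f"implied power: {drone_power_w:.2f} W")
```

Dividing 158 GOPS by 79 GOPS/W gives an implied 2 W for the drone-dataset configuration; the record does not say how power was measured, so treat this as a consistency check only.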
doi_str_mv 10.1145/3583074
format Article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3583074</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3583074</sourcerecordid><originalsourceid>FETCH-LOGICAL-a277t-6b1c4f94f5ba6580d0746cc89afae8e02a447f76f9d3ca06654ffcb2fd0be2c33</originalsourceid><addsrcrecordid>eNo9kF1LwzAUhoMoOKd471XvvIomTZq2l2VfCgNF9Lqk6TlbZG1GEpH561fdHBx4D7wPB85DyC1nD5zL7FFkhWC5PCMjXgpFc8nl-Wln6pJchfDJmBKqkCPSVpuV8zauO7rWvv3WHpKJo24bbWd_dLSuT9D5ZNaDX-0oIFpjoY_J1LsekilEMH_QMG8Q3Jc3QI3rQ_Ta9tAm89dFdU0uUG8C3BxzTD7ms_fJE12-LJ4n1ZLqNM8jVQ03EkuJWaNVVrB2eEMZU5QaNRTAUi1ljrnCshVGM6UyiWiaFFvWQGqEGJP7w13jXQgesN5622m_qzmrf-XURzkDeXcgtelO0H-5B1muYMI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA</title><source>ACM Digital Library Complete</source><creator>Suh, Han-Sok ; Meng, Jian ; Nguyen, Ty ; Kumar, Vijay ; Cao, Yu ; Seo, Jae-Sun</creator><creatorcontrib>Suh, Han-Sok ; Meng, Jian ; Nguyen, Ty ; Kumar, Vijay ; Cao, Yu ; Seo, Jae-Sun</creatorcontrib><description>Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a high amount of computation and memory storage, which makes it difficult to perform efficient inference on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we designed and co-optimized an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. 
We trained an SSD object detection algorithm with a custom drone dataset. For inference, we employed low-precision quantization and adapted the width of the SSD CNN model. To improve throughput, we use dual-data rate operations for DSPs to effectively double the throughput with limited DSP counts. For different SSD algorithm models, we analyze accuracy or mean average precision (mAP) and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluated the FPGA hardware for a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1 to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset, using the same FPGA device, but at a low-power consumption of 2.54 W. For the COCO dataset, our MobileNet-V1 implementation achieved an mAP of 16.8, and 4.9 FPS/W for energy-efficiency, which is ∼ 1.9× higher than prior FPGA works or other commercial hardware platforms.</description><identifier>ISSN: 1936-7406</identifier><identifier>EISSN: 1936-7414</identifier><identifier>DOI: 10.1145/3583074</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computer systems organization ; Neural networks ; Reconfigurable computing</subject><ispartof>ACM transactions on reconfigurable technology and systems, 2023-05, Vol.16 (2), p.1-25, Article 33</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. 
Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a277t-6b1c4f94f5ba6580d0746cc89afae8e02a447f76f9d3ca06654ffcb2fd0be2c33</citedby><cites>FETCH-LOGICAL-a277t-6b1c4f94f5ba6580d0746cc89afae8e02a447f76f9d3ca06654ffcb2fd0be2c33</cites><orcidid>0000-0002-7703-5020 ; 0000-0002-3902-9391 ; 0000-0001-6968-1180 ; 0000-0002-4551-7789 ; 0000-0002-4466-4824 ; 0000-0002-6351-8865</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3583074$$EPDF$$P50$$Gacm$$H</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76228</link.rule.ids></links><search><creatorcontrib>Suh, Han-Sok</creatorcontrib><creatorcontrib>Meng, Jian</creatorcontrib><creatorcontrib>Nguyen, Ty</creatorcontrib><creatorcontrib>Kumar, Vijay</creatorcontrib><creatorcontrib>Cao, Yu</creatorcontrib><creatorcontrib>Seo, Jae-Sun</creatorcontrib><title>Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA</title><title>ACM transactions on reconfigurable technology and systems</title><addtitle>ACM TRETS</addtitle><description>Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. 
However, they require a high amount of computation and memory storage, which makes it difficult to perform efficient inference on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we designed and co-optimized an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We trained an SSD object detection algorithm with a custom drone dataset. For inference, we employed low-precision quantization and adapted the width of the SSD CNN model. To improve throughput, we use dual-data rate operations for DSPs to effectively double the throughput with limited DSP counts. For different SSD algorithm models, we analyze accuracy or mean average precision (mAP) and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluated the FPGA hardware for a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1 to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset, using the same FPGA device, but at a low-power consumption of 2.54 W. 
For the COCO dataset, our MobileNet-V1 implementation achieved an mAP of 16.8, and 4.9 FPS/W for energy-efficiency, which is ∼ 1.9× higher than prior FPGA works or other commercial hardware platforms.</description><subject>Computer systems organization</subject><subject>Neural networks</subject><subject>Reconfigurable computing</subject><issn>1936-7406</issn><issn>1936-7414</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNo9kF1LwzAUhoMoOKd471XvvIomTZq2l2VfCgNF9Lqk6TlbZG1GEpH561fdHBx4D7wPB85DyC1nD5zL7FFkhWC5PCMjXgpFc8nl-Wln6pJchfDJmBKqkCPSVpuV8zauO7rWvv3WHpKJo24bbWd_dLSuT9D5ZNaDX-0oIFpjoY_J1LsekilEMH_QMG8Q3Jc3QI3rQ_Ta9tAm89dFdU0uUG8C3BxzTD7ms_fJE12-LJ4n1ZLqNM8jVQ03EkuJWaNVVrB2eEMZU5QaNRTAUi1ljrnCshVGM6UyiWiaFFvWQGqEGJP7w13jXQgesN5622m_qzmrf-XURzkDeXcgtelO0H-5B1muYMI</recordid><startdate>20230510</startdate><enddate>20230510</enddate><creator>Suh, Han-Sok</creator><creator>Meng, Jian</creator><creator>Nguyen, Ty</creator><creator>Kumar, Vijay</creator><creator>Cao, Yu</creator><creator>Seo, Jae-Sun</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-7703-5020</orcidid><orcidid>https://orcid.org/0000-0002-3902-9391</orcidid><orcidid>https://orcid.org/0000-0001-6968-1180</orcidid><orcidid>https://orcid.org/0000-0002-4551-7789</orcidid><orcidid>https://orcid.org/0000-0002-4466-4824</orcidid><orcidid>https://orcid.org/0000-0002-6351-8865</orcidid></search><sort><creationdate>20230510</creationdate><title>Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA</title><author>Suh, Han-Sok ; Meng, Jian ; Nguyen, Ty ; Kumar, Vijay ; Cao, Yu ; Seo, 
Jae-Sun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a277t-6b1c4f94f5ba6580d0746cc89afae8e02a447f76f9d3ca06654ffcb2fd0be2c33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer systems organization</topic><topic>Neural networks</topic><topic>Reconfigurable computing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Suh, Han-Sok</creatorcontrib><creatorcontrib>Meng, Jian</creatorcontrib><creatorcontrib>Nguyen, Ty</creatorcontrib><creatorcontrib>Kumar, Vijay</creatorcontrib><creatorcontrib>Cao, Yu</creatorcontrib><creatorcontrib>Seo, Jae-Sun</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on reconfigurable technology and systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Suh, Han-Sok</au><au>Meng, Jian</au><au>Nguyen, Ty</au><au>Kumar, Vijay</au><au>Cao, Yu</au><au>Seo, Jae-Sun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA</atitle><jtitle>ACM transactions on reconfigurable technology and systems</jtitle><stitle>ACM TRETS</stitle><date>2023-05-10</date><risdate>2023</risdate><volume>16</volume><issue>2</issue><spage>1</spage><epage>25</epage><pages>1-25</pages><artnum>33</artnum><issn>1936-7406</issn><eissn>1936-7414</eissn><abstract>Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a high amount of computation and memory storage, which makes it difficult to perform efficient inference on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). 
Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we designed and co-optimized an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We trained an SSD object detection algorithm with a custom drone dataset. For inference, we employed low-precision quantization and adapted the width of the SSD CNN model. To improve throughput, we use dual-data rate operations for DSPs to effectively double the throughput with limited DSP counts. For different SSD algorithm models, we analyze accuracy or mean average precision (mAP) and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluated the FPGA hardware for a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1 to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset, using the same FPGA device, but at a low-power consumption of 2.54 W. For the COCO dataset, our MobileNet-V1 implementation achieved an mAP of 16.8, and 4.9 FPS/W for energy-efficiency, which is ∼ 1.9× higher than prior FPGA works or other commercial hardware platforms.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3583074</doi><tpages>25</tpages><orcidid>https://orcid.org/0000-0002-7703-5020</orcidid><orcidid>https://orcid.org/0000-0002-3902-9391</orcidid><orcidid>https://orcid.org/0000-0001-6968-1180</orcidid><orcidid>https://orcid.org/0000-0002-4551-7789</orcidid><orcidid>https://orcid.org/0000-0002-4466-4824</orcidid><orcidid>https://orcid.org/0000-0002-6351-8865</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1936-7406
ispartof ACM transactions on reconfigurable technology and systems, 2023-05, Vol.16 (2), p.1-25, Article 33
issn 1936-7406
1936-7414
language eng
recordid cdi_crossref_primary_10_1145_3583074
source ACM Digital Library Complete
subjects Computer systems organization
Neural networks
Reconfigurable computing
title Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T12%3A20%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Algorithm-hardware%20Co-optimization%20for%20Energy-efficient%20Drone%20Detection%20on%20Resource-constrained%20FPGA&rft.jtitle=ACM%20transactions%20on%20reconfigurable%20technology%20and%20systems&rft.au=Suh,%20Han-Sok&rft.date=2023-05-10&rft.volume=16&rft.issue=2&rft.spage=1&rft.epage=25&rft.pages=1-25&rft.artnum=33&rft.issn=1936-7406&rft.eissn=1936-7414&rft_id=info:doi/10.1145/3583074&rft_dat=%3Cacm_cross%3E3583074%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true