Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images

Low-latency object detection on drone images is an important but challenging task on resource-constrained unmanned aerial vehicle (UAV) platforms. This paper investigates optimizing the detection head with sparse convolution, which proves effective in balancing accuracy and effic...

Bibliographic details
Main authors: Du, Bowei; Huang, Yecheng; Chen, Jiaxin; Huang, Di
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Du, Bowei
Huang, Yecheng
Chen, Jiaxin
Huang, Di
description Low-latency object detection on drone images is an important but challenging task on resource-constrained unmanned aerial vehicle (UAV) platforms. This paper investigates optimizing the detection head with sparse convolution, which proves effective in balancing accuracy and efficiency. Nevertheless, sparse convolution suffers from inadequate integration of contextual information for tiny objects, as well as clumsy control of the mask ratio when the foreground varies in scale. To address these issues, we propose a novel global context-enhanced adaptive sparse convolutional network (CEASC). It first develops a context-enhanced group normalization (CE-GN) layer, which replaces the statistics based on sparsely sampled features with global contextual ones, and then designs an adaptive multi-layer masking strategy to generate optimal mask ratios at distinct scales for compact foreground coverage, promoting both accuracy and efficiency. Extensive experimental results on two major benchmarks, i.e., VisDrone and UAVDT, demonstrate that CEASC remarkably reduces the GFLOPs and accelerates inference when plugged into typical state-of-the-art detection frameworks (e.g., RetinaNet and GFL V1) while maintaining competitive performance. Code is available at https://github.com/Cuogeihong/CEASC.
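The CE-GN idea named in the description (normalizing sparsely computed features with statistics taken from a global feature map rather than from the sparse samples themselves) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-list feature layout, the way the global feature is obtained, and the omission of learnable scale and shift parameters are all assumptions made for illustration.

```python
from math import sqrt
from statistics import fmean, pvariance


def group_stats(feat, num_groups, eps=1e-5):
    """Return (mean, std) per channel group.

    feat is a list of C channels, each a flat list of H*W values
    (a hypothetical layout chosen to keep the sketch dependency-free).
    """
    size = len(feat) // num_groups
    stats = []
    for g in range(num_groups):
        values = [v for ch in feat[g * size:(g + 1) * size] for v in ch]
        mu = fmean(values)
        stats.append((mu, sqrt(pvariance(values, mu) + eps)))
    return stats


def context_enhanced_group_norm(sparse_feat, global_feat, mask, num_groups):
    """Normalize sparse features with statistics of the global feature.

    mask is a flat list of 0/1 flags marking the positions where the
    sparse convolution was evaluated; other positions are zeroed.
    """
    stats = group_stats(global_feat, num_groups)  # global, not sparse, statistics
    size = len(sparse_feat) // num_groups
    out = []
    for c, channel in enumerate(sparse_feat):
        mu, sigma = stats[c // size]
        out.append([m * (v - mu) / sigma for v, m in zip(channel, mask)])
    return out


# Toy input: 2 channels, 4 spatial positions, only positions 0 and 3 sampled.
global_feat = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
sparse_feat = [[1.0, 0.0, 0.0, 4.0], [5.0, 0.0, 0.0, 8.0]]
mask = [1, 0, 0, 1]
normalized = context_enhanced_group_norm(sparse_feat, global_feat, mask, num_groups=1)
```

In this sketch the mean and standard deviation come from all eight values of the toy global feature, so the four sampled positions are normalized against the full context rather than against their own, much smaller, sample.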
doi_str_mv 10.48550/arxiv.2303.14488
format Article
creationdate 2023-03-25
genre article
recordtype article
sourcetype Open Access Repository
collection arXiv Computer Science; arXiv.org
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
linktorsrc https://arxiv.org/abs/2303.14488
backlink https://doi.org/10.48550/arXiv.2303.14488
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2303.14488
ispartof
issn
language eng
recordid cdi_arxiv_primary_2303_14488
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T13%3A08%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adaptive%20Sparse%20Convolutional%20Networks%20with%20Global%20Context%20Enhancement%20for%20Faster%20Object%20Detection%20on%20Drone%20Images&rft.au=Du,%20Bowei&rft.date=2023-03-25&rft_id=info:doi/10.48550/arxiv.2303.14488&rft_dat=%3Carxiv_GOX%3E2303_14488%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true