SCNet: A Lightweight and Efficient Object Detection Network for Remote Sensing

Detecting small objects in remote sensing images is a meaningful yet challenging task, especially when deploying existing object detection models on edge devices with limited hardware resources. In this study, we present an efficient remote sensing object detection model named SCNet, built on the ultra-lightweight YOLOv5n (you only look once). To address the significant loss of small-object features in the model's neck, we introduce the selective feature enhancement block (SFEB). The SFEB selectively processes the portion of the feature maps that contributes more to semantic information extraction while retaining the remainder, extracting rich semantic information while preserving the detail information crucial for small object detection. Furthermore, we incorporate the contextual transformer block (CTB) at the junction of the backbone and neck; by exploring contextual information in shallow-level feature maps, the CTB enhances the model's ability to understand the relationships and boundaries between objects and backgrounds, improving detection of challenging small and medium objects. Experimental results on the NWPU VHR-10 and DIOR datasets demonstrate the model's performance: mean average precisions (mAPs) of 96.6% and 72.6% at IoU = 0.5, 487 frames/s at a batch size of 32 (FPS32), only 4.6 giga floating-point operations (GFLOPs), and 1.8 million parameters.
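The abstract does not give the SFEB's internal structure; the following is only a hypothetical sketch of the split-process-retain idea it describes (the function name `sfeb_sketch`, the even channel split, and the 1x1-convolution weights `w` are illustrative assumptions, not the authors' design):

```python
import numpy as np

def sfeb_sketch(x, w):
    """Illustrative split-process-retain block.

    x: feature map of shape (C, H, W).
    w: 1x1-convolution weights of shape (C//2, C//2).
    The first half of the channels is processed; the second half is
    passed through untouched, then the two halves are concatenated.
    """
    c = x.shape[0] // 2
    processed, retained = x[:c], x[c:]
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    out = np.tensordot(w, processed, axes=([1], [0]))
    return np.concatenate([out, retained], axis=0)

x = np.random.rand(8, 4, 4)
w = np.eye(4)  # identity weights, so the demo output is easy to check
y = sfeb_sketch(x, w)
assert y.shape == (8, 4, 4)
assert np.allclose(y[4:], x[4:])  # the retained half is unchanged
```

The retained half carries the input's fine spatial detail through unchanged, which is consistent with the abstract's claim that the block preserves the detail information small objects depend on while the processed half extracts semantics.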

Bibliographic Details

Published in: IEEE geoscience and remote sensing letters, 2024, Vol. 21, p. 1-5
Authors: Zhu, Shiliang; Miao, Min
Format: Article
Language: English
Publisher: IEEE (Piscataway)
doi_str_mv 10.1109/LGRS.2023.3344937
identifier ISSN: 1545-598X
identifier EISSN: 1558-0571
identifier CODEN: IGRSBY
source IEEE Electronic Library (IEL)
subjects Batch flotation
Context modeling
Data mining
Detection
Feature extraction
Feature maps
Floating point arithmetic
Information processing
Information retrieval
Lightweight
Lightweight model
Modelling
Neck
object detection
Object recognition
Remote sensing
Semantics
YOLO
you only look once (YOLO)
title SCNet: A Lightweight and Efficient Object Detection Network for Remote Sensing