SCNet: A Lightweight and Efficient Object Detection Network for Remote Sensing
Detecting small objects in remote sensing images is meaningful yet challenging, especially when deploying existing object detection models on edge terminal devices with limited hardware resources. In this study, we present an efficient remote sensing object detection model named SCNet, built on the ultra-lightweight YOLOv5n (you only look once). To address the significant loss of small-object features in the model's neck, we introduce the selective feature enhancement block (SFEB). The SFEB selectively processes the portion of the feature maps that contributes most to semantic information extraction while retaining the remainder, allowing the model to extract rich semantic information while preserving the detail information crucial for small object detection. Furthermore, we incorporate the contextual transformer block (CTB) at the junction of the backbone and neck; by exploring contextual information in shallow-level feature maps, the CTB enhances the model's ability to understand the relationships and boundaries between objects and backgrounds, improving detection of challenging small and medium-sized objects. Experimental results on the NWPU VHR-10 and DIOR datasets demonstrate the model's performance, achieving mean average precisions (mAPs) of 96.6% and 72.6% at IoU = 0.5. The model runs at 487 frames/s with a batch size of 32 (FPS32) while requiring only 4.6 giga floating-point operations (GFLOPs) and 1.8 million parameters.
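The SFEB description above amounts to a split-transform-merge over channels: only part of the feature map is pushed through semantic-extraction convolutions, while the rest bypasses them so fine detail survives. Below is a minimal PyTorch sketch of that idea; the split ratio, kernel size, and layer layout are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SFEBSketch(nn.Module):
    """Sketch of the SFEB idea: convolve only part of the channels for
    semantic extraction, concatenate the untouched remainder, then fuse."""

    def __init__(self, channels: int, ratio: float = 0.5):
        super().__init__()
        self.enhanced = int(channels * ratio)      # channels sent through convs
        self.retained = channels - self.enhanced   # channels passed through as-is
        self.semantic = nn.Sequential(
            nn.Conv2d(self.enhanced, self.enhanced, 3, padding=1, bias=False),
            nn.BatchNorm2d(self.enhanced),
            nn.SiLU(),
        )
        # 1x1 convolution to fuse the re-concatenated halves
        self.fuse = nn.Conv2d(channels, channels, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = torch.split(x, [self.enhanced, self.retained], dim=1)
        return self.fuse(torch.cat([self.semantic(a), b], dim=1))
```

For a 128-channel map, `SFEBSketch(128)(torch.randn(1, 128, 40, 40))` returns a tensor of the same shape; the retained half carries the raw detail the abstract says small-object detection needs.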
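The CTB is described as mining contextual information in shallow feature maps so the model can separate objects from background. Contextual-transformer-style blocks typically pair a static neighbourhood context (a k×k convolution over keys) with a dynamic, attention-gated aggregation of values; the sketch below follows that pattern, with the hyperparameters and exact attention form being assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class CTBSketch(nn.Module):
    """Sketch of a contextual-transformer-style block: static k x k context
    plus an attention-gated dynamic context, summed at the output."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Static context: each position aggregates its k x k neighbourhood.
        self.key = nn.Conv2d(channels, channels, kernel_size, padding=pad, bias=False)
        self.value = nn.Conv2d(channels, channels, 1, bias=False)
        # Attention weights derived from [input; static context] pairs.
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        static_ctx = self.key(x)                                   # neighbourhood context
        gate = self.attn(torch.cat([x, static_ctx], dim=1)).sigmoid()
        dynamic_ctx = gate * self.value(x)                         # context-guided values
        return static_ctx + dynamic_ctx
```

The sum of static and dynamic context is what lets such a block reason about object-background boundaries beyond a single receptive field.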
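The reported mAPs are computed at IoU = 0.5, i.e., a prediction counts as a true positive only when its box overlaps a ground-truth box by at least half of their union. For reference, a minimal IoU computation for axis-aligned (x1, y1, x2, y2) boxes:

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

assert iou((0, 0, 2, 2), (0, 0, 2, 2)) == 1.0
```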
Published in: | IEEE Geoscience and Remote Sensing Letters, 2024, Vol. 21, pp. 1-5 |
---|---|
Main Authors: | Zhu, Shiliang; Miao, Min |
Format: | Article |
Language: | English |
Subjects: | Batch flotation; Context modeling; Data mining; Detection; Feature extraction; Feature maps; Floating point arithmetic; Information processing; Information retrieval; Lightweight; Lightweight model; Modelling; Neck; object detection; Object recognition; Remote sensing; Semantics; YOLO; you only look once (YOLO) |
DOI: | 10.1109/LGRS.2023.3344937 |
ISSN: | 1545-598X |
EISSN: | 1558-0571 |
Publisher: | IEEE (Piscataway) |
Source: | IEEE Electronic Library (IEL) |
Online Access: | Order full text |