A Robust Feature Downsampling Module for Remote Sensing Visual Tasks

Remote sensing (RS) images present unique challenges for computer vision due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising results for traditional visual tasks. However, they use convolution to reduce feature map dimensionality, which can resul...

Detailed description

Bibliographic details
Published in:	IEEE transactions on geoscience and remote sensing 2023-01, Vol.61, p.1-1
Main authors:	Lu, Wei; Chen, Si-Bao; Tang, Jin; Ding, Chris H. Q.; Luo, Bin
Format:	Article
Language:	eng
Subjects:	Classification; Computer networks; Computer vision; Convolution; Datasets; Detection; Feature downsample; Feature extraction; Feature maps; Frequency locked loops; Image classification; Image processing; Image segmentation; Modules; Object detection; Object recognition; Remote sensing; Robustness; Segmentation; Semantic segmentation; Task analysis; Transformers; Visual tasks; Visualization

container_end_page	1
container_issue
container_start_page	1
container_title	IEEE transactions on geoscience and remote sensing
container_volume	61
creator	Lu, Wei; Chen, Si-Bao; Tang, Jin; Ding, Chris H. Q.; Luo, Bin
description	Remote sensing (RS) images present unique challenges for computer vision due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising results for traditional visual tasks. However, they use convolution to reduce feature map dimensionality, which can result in information loss for small objects in RS images and decreased performance. To address this problem, we propose a new and universal downsampling module named Robust Feature Downsampling (RFD). RFD fuses multiple feature maps extracted by different downsampling techniques, creating a more robust feature map with a complementary set of features. Leveraging this, we overcome the limitations of conventional convolutional downsampling, resulting in more accurate and robust analysis of RS images. We develop two versions of the RFD module, Shallow RFD (SRFD) and Deep RFD (DRFD), tailored to adapt to different stages of feature capture and improve feature robustness. We replace the downsampling layers of existing mainstream backbones with the RFD module and conduct comparative experiments on several public RS image datasets. The results show significant improvements compared to baseline approaches in RS image classification, object detection, and semantic segmentation. Specifically, our RFD module achieved an average performance gain of 1.5% on the NWPU-RESISC45 classification dataset without utilizing any additional pretraining data, resulting in state-of-the-art performance on this dataset. Moreover, in detection and segmentation tasks on the DOTA and iSAID datasets, our RFD module outperforms the baseline approaches by 2-7% when utilizing pretraining data from NWPU-RESISC45. These results highlight the value of the RFD module in enhancing the performance of RS visual tasks.
doi_str_mv	10.1109/TGRS.2023.3282048
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10142024</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10142024</ieee_id><sourcerecordid>2826476261</sourcerecordid><originalsourceid>FETCH-LOGICAL-c294t-ae6844722441196df6c31bfc4df8d246168f43cf55fc5e180a8ae47c2cd4604c3</originalsourceid><addsrcrecordid>eNpNkM9LwzAUx4MoOKd_gOAh4LkzSV_T9Dg2N4WJsE2vIUtfpLNrZtIi_ve2zIOnd3if7_vxIeSWswnnrHjYLtebiWAinaRCCQbqjIx4lqmESYBzMmK8kIlQhbgkVzHuGeOQ8XxE5lO69rsutnSBpu0C0rn_bqI5HOuq-aAvvuxqpM4HusaDb5FusIlD572Knanp1sTPeE0unKkj3vzVMXlbPG5nT8nqdfk8m64SKwpoE4NSAeRCAPD-nNJJm_Kds1A6VQqQXCoHqXVZ5myGXDGjDEJuhS1BMrDpmNyf5h6D_-owtnrvu9D0K3X_tIRcCsl7ip8oG3yMAZ0-hupgwo_mTA-y9CBLD7L0n6w-c3fKVIj4j-fQY5D-Al5mZNA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2826476261</pqid></control><display><type>article</type><title>A Robust Feature Downsampling Module for Remote Sensing Visual Tasks</title><source>IEEE Electronic Library (IEL)</source><creator>Lu, Wei ; Chen, Si-Bao ; Tang, Jin ; Ding, Chris H. Q. ; Luo, Bin</creator><creatorcontrib>Lu, Wei ; Chen, Si-Bao ; Tang, Jin ; Ding, Chris H. Q. ; Luo, Bin</creatorcontrib><description>Remote sensing (RS) images present unique challenges for computer vision due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising results for traditional visual tasks. However, they use convolution to reduce feature map dimensionality, which can result in information loss for small objects in RS images and decreased performance. To address this problem, we propose a new and universal downsampling module named Robust Feature Downsampling (RFD). RFD fuses multiple feature maps extracted by different downsampling techniques, creating a more robust feature map with a complementary set of features. 
Leveraging this, we overcome the limitations of conventional convolutional downsampling, resulting in more accurate and robust analysis of RS images. We develop two versions of RFD module, Shallow RFD (SRFD) and Deep RFD (DRFD), tailored to adapt to different stages of feature capture and improve feature robustness. We replace the downsampling layers of existing mainstream backbones with RFD module and conduct comparative experiments on several public RS image datasets. The results show significant improvements compared to baseline approaches in RS image classification, object detection, and semantic segmentation. Specifically, our RFD module achieved an average performance gain of 1.5% on NWPU-RESISC45 classification dataset without utilizing any additional pretraining data, resulting in state-of-the-art performance on this dataset. Moreover, in detection and segmentation tasks on DOTA and iSAID datasets, our RFD module outperforms the baseline approaches by 2-7% when utilizing pretraining data from NWPU-RESISC45. These results highlight the value of RFD module in enhancing the performance of RS visual tasks.</description><identifier>ISSN: 0196-2892</identifier><identifier>EISSN: 1558-0644</identifier><identifier>DOI: 10.1109/TGRS.2023.3282048</identifier><identifier>CODEN: IGRSD2</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Classification ; Computer networks ; Computer vision ; Convolution ; Datasets ; Detection ; Feature downsample ; Feature extraction ; Feature maps ; Frequency locked loops ; Image classification ; Image processing ; Image segmentation ; Modules ; Object detection ; Object recognition ; Remote sensing ; Robustness ; Segmentation ; Semantic segmentation ; Task analysis ; Transformers ; Visual tasks ; Visualization</subject><ispartof>IEEE transactions on geoscience and remote sensing, 2023-01, Vol.61, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c294t-ae6844722441196df6c31bfc4df8d246168f43cf55fc5e180a8ae47c2cd4604c3</citedby><cites>FETCH-LOGICAL-c294t-ae6844722441196df6c31bfc4df8d246168f43cf55fc5e180a8ae47c2cd4604c3</cites><orcidid>0000-0001-8375-3590 ; 0000-0001-5948-5055 ; 0009-0004-5197-5753 ; 0000-0003-1481-0162</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10142024$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10142024$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Lu, Wei</creatorcontrib><creatorcontrib>Chen, Si-Bao</creatorcontrib><creatorcontrib>Tang, Jin</creatorcontrib><creatorcontrib>Ding, Chris H. Q.</creatorcontrib><creatorcontrib>Luo, Bin</creatorcontrib><title>A Robust Feature Downsampling Module for Remote Sensing Visual Tasks</title><title>IEEE transactions on geoscience and remote sensing</title><addtitle>TGRS</addtitle><description>Remote sensing (RS) images present unique challenges for computer vision due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising results for traditional visual tasks. However, they use convolution to reduce feature map dimensionality, which can result in information loss for small objects in RS images and decreased performance. To address this problem, we propose a new and universal downsampling module named Robust Feature Downsampling (RFD). RFD fuses multiple feature maps extracted by different downsampling techniques, creating a more robust feature map with a complementary set of features. 
Leveraging this, we overcome the limitations of conventional convolutional downsampling, resulting in more accurate and robust analysis of RS images. We develop two versions of RFD module, Shallow RFD (SRFD) and Deep RFD (DRFD), tailored to adapt to different stages of feature capture and improve feature robustness. We replace the downsampling layers of existing mainstream backbones with RFD module and conduct comparative experiments on several public RS image datasets. The results show significant improvements compared to baseline approaches in RS image classification, object detection, and semantic segmentation. Specifically, our RFD module achieved an average performance gain of 1.5% on NWPU-RESISC45 classification dataset without utilizing any additional pretraining data, resulting in state-of-the-art performance on this dataset. Moreover, in detection and segmentation tasks on DOTA and iSAID datasets, our RFD module outperforms the baseline approaches by 2-7% when utilizing pretraining data from NWPU-RESISC45. 
These results highlight the value of RFD module in enhancing the performance of RS visual tasks.</description><subject>Classification</subject><subject>Computer networks</subject><subject>Computer vision</subject><subject>Convolution</subject><subject>Datasets</subject><subject>Detection</subject><subject>Feature downsample</subject><subject>Feature extraction</subject><subject>Feature maps</subject><subject>Frequency locked loops</subject><subject>Image classification</subject><subject>Image processing</subject><subject>Image segmentation</subject><subject>Modules</subject><subject>Object detection</subject><subject>Object recognition</subject><subject>Remote sensing</subject><subject>Robustness</subject><subject>Segmentation</subject><subject>Semantic segmentation</subject><subject>Task analysis</subject><subject>Transformers</subject><subject>Visual tasks</subject><subject>Visualization</subject><issn>0196-2892</issn><issn>1558-0644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkM9LwzAUx4MoOKd_gOAh4LkzSV_T9Dg2N4WJsE2vIUtfpLNrZtIi_ve2zIOnd3if7_vxIeSWswnnrHjYLtebiWAinaRCCQbqjIx4lqmESYBzMmK8kIlQhbgkVzHuGeOQ8XxE5lO69rsutnSBpu0C0rn_bqI5HOuq-aAvvuxqpM4HusaDb5FusIlD572Knanp1sTPeE0unKkj3vzVMXlbPG5nT8nqdfk8m64SKwpoE4NSAeRCAPD-nNJJm_Kds1A6VQqQXCoHqXVZ5myGXDGjDEJuhS1BMrDpmNyf5h6D_-owtnrvu9D0K3X_tIRcCsl7ip8oG3yMAZ0-hupgwo_mTA-y9CBLD7L0n6w-c3fKVIj4j-fQY5D-Al5mZNA</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Lu, Wei</creator><creator>Chen, Si-Bao</creator><creator>Tang, Jin</creator><creator>Ding, Chris H. Q.</creator><creator>Luo, Bin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7UA</scope><scope>8FD</scope><scope>C1K</scope><scope>F1W</scope><scope>FR3</scope><scope>H8D</scope><scope>H96</scope><scope>KR7</scope><scope>L.G</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0001-8375-3590</orcidid><orcidid>https://orcid.org/0000-0001-5948-5055</orcidid><orcidid>https://orcid.org/0009-0004-5197-5753</orcidid><orcidid>https://orcid.org/0000-0003-1481-0162</orcidid></search><sort><creationdate>20230101</creationdate><title>A Robust Feature Downsampling Module for Remote Sensing Visual Tasks</title><author>Lu, Wei ; Chen, Si-Bao ; Tang, Jin ; Ding, Chris H. Q. ; Luo, Bin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c294t-ae6844722441196df6c31bfc4df8d246168f43cf55fc5e180a8ae47c2cd4604c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Classification</topic><topic>Computer networks</topic><topic>Computer vision</topic><topic>Convolution</topic><topic>Datasets</topic><topic>Detection</topic><topic>Feature downsample</topic><topic>Feature extraction</topic><topic>Feature maps</topic><topic>Frequency locked loops</topic><topic>Image classification</topic><topic>Image processing</topic><topic>Image segmentation</topic><topic>Modules</topic><topic>Object detection</topic><topic>Object recognition</topic><topic>Remote sensing</topic><topic>Robustness</topic><topic>Segmentation</topic><topic>Semantic segmentation</topic><topic>Task analysis</topic><topic>Transformers</topic><topic>Visual tasks</topic><topic>Visualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lu, Wei</creatorcontrib><creatorcontrib>Chen, Si-Bao</creatorcontrib><creatorcontrib>Tang, Jin</creatorcontrib><creatorcontrib>Ding, Chris H. 
Q.</creatorcontrib><creatorcontrib>Luo, Bin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Water Resources Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy &amp; Non-Living Resources</collection><collection>Civil Engineering Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on geoscience and remote sensing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lu, Wei</au><au>Chen, Si-Bao</au><au>Tang, Jin</au><au>Ding, Chris H. Q.</au><au>Luo, Bin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Robust Feature Downsampling Module for Remote Sensing Visual Tasks</atitle><jtitle>IEEE transactions on geoscience and remote sensing</jtitle><stitle>TGRS</stitle><date>2023-01-01</date><risdate>2023</risdate><volume>61</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>0196-2892</issn><eissn>1558-0644</eissn><coden>IGRSD2</coden><abstract>Remote sensing (RS) images present unique challenges for computer vision due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising results for traditional visual tasks. 
However, they use convolution to reduce feature map dimensionality, which can result in information loss for small objects in RS images and decreased performance. To address this problem, we propose a new and universal downsampling module named Robust Feature Downsampling (RFD). RFD fuses multiple feature maps extracted by different downsampling techniques, creating a more robust feature map with a complementary set of features. Leveraging this, we overcome the limitations of conventional convolutional downsampling, resulting in more accurate and robust analysis of RS images. We develop two versions of RFD module, Shallow RFD (SRFD) and Deep RFD (DRFD), tailored to adapt to different stages of feature capture and improve feature robustness. We replace the downsampling layers of existing mainstream backbones with RFD module and conduct comparative experiments on several public RS image datasets. The results show significant improvements compared to baseline approaches in RS image classification, object detection, and semantic segmentation. Specifically, our RFD module achieved an average performance gain of 1.5% on NWPU-RESISC45 classification dataset without utilizing any additional pretraining data, resulting in state-of-the-art performance on this dataset. Moreover, in detection and segmentation tasks on DOTA and iSAID datasets, our RFD module outperforms the baseline approaches by 2-7% when utilizing pretraining data from NWPU-RESISC45. These results highlight the value of RFD module in enhancing the performance of RS visual tasks.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TGRS.2023.3282048</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0001-8375-3590</orcidid><orcidid>https://orcid.org/0000-0001-5948-5055</orcidid><orcidid>https://orcid.org/0009-0004-5197-5753</orcidid><orcidid>https://orcid.org/0000-0003-1481-0162</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0196-2892
ispartof	IEEE transactions on geoscience and remote sensing, 2023-01, Vol.61, p.1-1
issn	0196-2892 1558-0644
language	eng
recordid	cdi_ieee_primary_10142024
source	IEEE Electronic Library (IEL)
subjects	Classification; Computer networks; Computer vision; Convolution; Datasets; Detection; Feature downsample; Feature extraction; Feature maps; Frequency locked loops; Image classification; Image processing; Image segmentation; Modules; Object detection; Object recognition; Remote sensing; Robustness; Segmentation; Semantic segmentation; Task analysis; Transformers; Visual tasks; Visualization
title	A Robust Feature Downsampling Module for Remote Sensing Visual Tasks
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T22%3A15%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Robust%20Feature%20Downsampling%20Module%20for%20Remote%20Sensing%20Visual%20Tasks&rft.jtitle=IEEE%20transactions%20on%20geoscience%20and%20remote%20sensing&rft.au=Lu,%20Wei&rft.date=2023-01-01&rft.volume=61&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=0196-2892&rft.eissn=1558-0644&rft.coden=IGRSD2&rft_id=info:doi/10.1109/TGRS.2023.3282048&rft_dat=%3Cproquest_RIE%3E2826476261%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2826476261&rft_id=info:pmid/&rft_ieee_id=10142024&rfr_iscdi=true
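
The description field above says that RFD "fuses multiple feature maps extracted by different downsampling techniques". The record does not say which techniques are fused or how, so the following is only a minimal sketch of the general idea, not the authors' module. The three branches (2x2 max pooling, 2x2 average pooling, stride-2 sampling), the fixed fusion weights, and all function names are illustrative assumptions; a real backbone would operate on multi-channel tensors and would presumably learn the fusion.

```python
from typing import Iterator, List, Sequence, Tuple

Grid = List[List[float]]
Block = Tuple[float, float, float, float]


def _blocks(x: Grid) -> Iterator[List[Block]]:
    """Yield each row of non-overlapping 2x2 blocks of a 2-D feature map."""
    for i in range(0, len(x) - 1, 2):
        yield [(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
               for j in range(0, len(x[0]) - 1, 2)]


def max_pool(x: Grid) -> Grid:
    """Keep the strongest response in each 2x2 block."""
    return [[max(b) for b in row] for row in _blocks(x)]


def avg_pool(x: Grid) -> Grid:
    """Keep the mean response of each 2x2 block."""
    return [[sum(b) / 4 for b in row] for row in _blocks(x)]


def strided(x: Grid) -> Grid:
    """Keep only the top-left value of each 2x2 block (stride-2 sampling)."""
    return [[b[0] for b in row] for row in _blocks(x)]


def fused_downsample(x: Grid,
                     weights: Sequence[float] = (0.5, 0.25, 0.25)) -> Grid:
    """Halve the resolution by a weighted sum of three downsampled branches.

    The weights are hypothetical constants chosen for illustration only.
    """
    branches = [max_pool(x), avg_pool(x), strided(x)]
    rows, cols = len(branches[0]), len(branches[0][0])
    return [[sum(w * m[i][j] for w, m in zip(weights, branches))
             for j in range(cols)]
            for i in range(rows)]
```

For a 2x2 tile whose only non-zero value sits in the bottom-right corner, `[[0.0, 0.0], [0.0, 9.0]]`, `strided` returns `[[0.0]]` and loses the response entirely, `max_pool` returns `[[9.0]]`, and `fused_downsample` returns `[[5.0625]]`. This toy case shows why combining complementary branches can preserve a small object that a single downsampling path would drop.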