Inductive Biased Swin-Transformer With Cyclic Regressor for Remote Sensing Scene Classification

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023-01, Vol. 16, p. 1-14
Main authors: Hao, Siyuan; Li, Nan; Ye, Yuanxin
Format: Article
Language: English
Online access: Full text
Abstract: Convolutional neural networks (CNNs) have been widely used in remote sensing scene classification. However, CNNs cannot take the long-range dependencies of local features into account. By contrast, a vision transformer (ViT) is good at capturing long-range dependencies because it models the global relationships among local features through a self-attention mechanism. Although the ViT obtains good results when trained on large-scale datasets (e.g., ImageNet), it is hard to adapt to small-scale datasets (e.g., remote sensing image datasets) because it lacks the inductive bias that CNNs naturally possess. Therefore, we propose the inductive biased Swin transformer with cyclic regressor and random dense sampler (IBSwin-CR) to improve the training of the Swin transformer on remote sensing image datasets. It builds upon three modules: an inductive biased shifted window multihead self-attention (IBSW-MSA) module, a random dense sampler, and a regressor with a cyclic regression loss. The IBSW-MSA module provides both the inductive bias information and the long-range dependencies of the attention map. The final feature map then passes through a random dense sampler, from which additional spatial information is learned. Finally, the network is trained with a cross-entropy loss function and a cyclic regression loss function. The proposed IBSwin-CR model is evaluated on public datasets such as the NWPU-RESISC45 dataset and the Aerial Image Dataset, and the experimental results show that it achieves better performance than other classification models, especially when only a small number of training samples is available.
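The abstract names two training objectives, a cross-entropy loss on the class predictions and a cyclic regression loss, but this record does not define the cyclic term or how the two losses are combined. The following is a minimal PyTorch sketch of one such joint objective, under stated assumptions: a hypothetical regressor head that predicts an angle, class indices mapped onto a circle so the regression target wraps around, and a weighting factor lambda_reg. All of these details are illustrative and are not taken from the paper.

import math

import torch
import torch.nn.functional as F

def cyclic_regression_loss(pred_angle, target_idx, num_classes):
    # Hypothetical cyclic term (not the paper's definition): map each class
    # index onto the unit circle and penalize the squared angular distance,
    # so the target wraps around and class 0 is adjacent to the last class.
    target_angle = 2.0 * math.pi * target_idx.float() / num_classes
    # Fold the signed angular difference into [-pi, pi).
    diff = torch.remainder(pred_angle - target_angle + math.pi, 2.0 * math.pi) - math.pi
    return (diff ** 2).mean()

def joint_loss(logits, pred_angle, target_idx, num_classes, lambda_reg=0.1):
    # Cross-entropy on the classifier logits plus the weighted cyclic term.
    ce = F.cross_entropy(logits, target_idx)
    return ce + lambda_reg * cyclic_regression_loss(pred_angle, target_idx, num_classes)

# Toy usage with random tensors standing in for the outputs of an IBSwin-CR-like model.
num_classes = 45                                  # NWPU-RESISC45 has 45 scene classes
logits = torch.randn(8, num_classes)              # classification head output
pred_angle = torch.rand(8) * 2.0 * math.pi        # hypothetical regressor head output
targets = torch.randint(0, num_classes, (8,))     # ground-truth scene labels
loss = joint_loss(logits, pred_angle, targets, num_classes)

The only design choice that makes the term "cyclic" in this sketch is the folding step: torch.remainder keeps the angular error in [-pi, pi), so a prediction just past the last class is treated as close to class 0 rather than as maximally wrong.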
DOI: 10.1109/JSTARS.2023.3290676
ISSN: 1939-1404
EISSN: 2151-1535
Source: DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek (Electronic Journals Library) - freely accessible e-journals
Subjects: Artificial neural networks
Bias
Classification
Datasets
Entropy
Feature extraction
Feature maps
Loss function
Modules
Neural networks
Remote sensing
remote sensing image
Samplers
Scene classification
Self-supervised learning
self-supervised learning (SSL)
Spatial data
swin transformer
Task analysis
Training
Transformers