PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification

In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification. First, ResNet50 is used as the backbone to extract the multilayer feature maps of RSI. Then, to enhance the model's position sensitivity to local objects in RSI, a new position-sensitive cross-layer interactive attention (PSCLIA) mechanism is designed, and based on it a novel PSCLI-TF encoder is constructed to perform layer-by-layer interactive fusion on the multilayer feature maps to obtain the multigranularity cross-layer fusion (CLF) feature of RSI. Finally, a prototype-based self-supervised loss function (SELF) is constructed to alleviate the semantic gap problem of "large intraclass variance and small interclass variance" in RSI scene classification. Comparative experimental results based on three datasets (i.e., AID, NWPU, and UCM) indicate that the classification performance of the designed PSCLI-TF model is highly competitive compared with other state-of-the-art methods.
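The pipeline the abstract describes — multilayer feature maps, cross-layer attention fusion, and a prototype-based loss — can be sketched as a minimal NumPy toy. All names, shapes, and formulas below are illustrative assumptions: this is generic cross-attention between a deep and a shallow feature map and a generic prototype cross-entropy, not the paper's actual PSCLIA mechanism or SELF loss.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_layer_attention(deep, shallow):
    """Toy cross-layer attention: deep-layer tokens (queries) attend to
    shallow-layer tokens (keys/values), then fuse with a residual add.
    deep: (N_q, d), shallow: (N_kv, d) token matrices."""
    d = deep.shape[-1]
    attn = softmax(deep @ shallow.T / np.sqrt(d), axis=-1)  # (N_q, N_kv)
    return deep + attn @ shallow                             # fused (N_q, d)

def prototype_loss(features, labels, prototypes):
    """Toy prototype-based loss: softmax over negative squared distances
    to per-class prototypes, cross-entropy against the true class.
    Pulls features toward their own prototype, away from the others."""
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (N, C)
    logp = np.log(softmax(-d2, axis=-1))
    return -logp[np.arange(len(labels)), labels].mean()
```

For example, fusing a 7x7 deep map (49 tokens) with a 14x14 shallow map (196 tokens), both projected to 64 channels, would call `cross_layer_attention` on matrices of shape (49, 64) and (196, 64) and return a fused (49, 64) representation.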

Bibliographic details

Published in: IEEE Geoscience and Remote Sensing Letters, 2024, Vol. 21, pp. 1-5
Authors: Li, Daxiang; Liu, Runyuan; Tang, Yao; Liu, Ying
Format: Article
Language: English
DOI: 10.1109/LGRS.2024.3359415
Publisher: IEEE, Piscataway
ISSN: 1545-598X
EISSN: 1558-0571
Source: IEEE Electronic Library (IEL)
Subjects:
Classification
Cross layer design
Feature extraction
Feature maps
Image classification
Multilayers
Position sensing
Position-sensitive transformer
Prototypes
Remote sensing
remote sensing image (RSI) classification
Scene classification
self-supervised learning
Semantics
Transformers