PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification
In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification.
Published in: | IEEE geoscience and remote sensing letters 2024, Vol.21, p.1-5 |
---|---|
Main authors: | Li, Daxiang; Liu, Runyuan; Tang, Yao; Liu, Ying |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 5 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE geoscience and remote sensing letters |
container_volume | 21 |
creator | Li, Daxiang; Liu, Runyuan; Tang, Yao; Liu, Ying |
description | In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification. First, ResNet50 is used as the backbone to extract the multilayer feature maps of RSI. Then, to enhance the model's position sensitivity to local objects in RSI, a new position-sensitive cross-layer interactive attention (PSCLIA) mechanism is designed, and based on it a novel PSCLI-TF encoder is constructed to perform layer-by-layer interactive fusion on the multilayer feature maps to obtain the multigranularity cross-layer fusion (CLF) feature of RSI. Finally, a prototype-based self-supervised loss function (SELF) is constructed to alleviate the semantic gap problem of "large intraclass variance and small interclass variance" in RSI scene classification. Comparative experimental results based on three datasets (i.e., AID, NWPU, and UCM) indicate that the classification performance of the designed PSCLI-TF model is highly competitive compared with other state-of-the-art methods. |
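The description outlines a three-stage pipeline: a ResNet50 backbone produces multilayer feature maps, a cross-layer attention mechanism fuses them, and a prototype-based loss shapes the embedding space. The record does not specify the paper's actual PSCLIA mechanism or SELF loss, so the NumPy sketch below uses a generic scaled dot-product cross-attention between two layers and a generic squared-distance prototype loss as hypothetical stand-ins; all shapes and function names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_layer_attention(fine, coarse):
    """Fuse a fine-layer feature map (queries) with a coarser one
    (keys/values). Both are flattened spatial tokens:
    fine: (N_f, d), coarse: (N_c, d). Returns (N_f, d)."""
    d = fine.shape[-1]
    attn = softmax(fine @ coarse.T / np.sqrt(d), axis=-1)  # (N_f, N_c)
    return fine + attn @ coarse  # residual cross-layer fusion

def prototype_loss(features, labels, prototypes):
    """Generic prototype loss: each feature is pulled toward its class
    prototype and pushed from the others via a softmax over negative
    squared distances (not the paper's exact SELF formulation)."""
    diffs = features[:, None, :] - prototypes[None, :, :]   # (B, C, d)
    logits = -(diffs ** 2).sum(-1)                          # (B, C)
    logp = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()
```

A feature assigned to its own prototype yields a lower loss than one assigned to a wrong prototype, which is the property the record's "large intraclass variance and small interclass variance" remark targets.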
doi_str_mv | 10.1109/LGRS.2024.3359415 |
format | Article |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
eissn | 1558-0571 |
coden | IGRSBY |
ieee_id | 10415469 |
orcidid | https://orcid.org/0000-0002-5766-5973 |
pqid | 2924040395 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1545-598X |
ispartof | IEEE geoscience and remote sensing letters, 2024, Vol.21, p.1-5 |
issn | 1545-598X; 1558-0571 |
language | eng |
recordid | cdi_crossref_primary_10_1109_LGRS_2024_3359415 |
source | IEEE Electronic Library (IEL) |
subjects | Classification; Cross layer design; Feature extraction; Feature maps; Image classification; Multilayers; Position sensing; Position-sensitive transformer; Prototypes; Remote sensing; remote sensing image (RSI) classification; Scene classification; self-supervised learning; Semantics; Transformers |
title | PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T10%3A44%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PSCLI-TF:%20Position-Sensitive%20Cross-Layer%20Interactive%20Transformer%20Model%20for%20Remote%20Sensing%20Image%20Scene%20Classification&rft.jtitle=IEEE%20geoscience%20and%20remote%20sensing%20letters&rft.au=Li,%20Daxiang&rft.date=2024&rft.volume=21&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=1545-598X&rft.eissn=1558-0571&rft.coden=IGRSBY&rft_id=info:doi/10.1109/LGRS.2024.3359415&rft_dat=%3Cproquest_RIE%3E2924040395%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2924040395&rft_id=info:pmid/&rft_ieee_id=10415469&rfr_iscdi=true |