PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification

In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification. First, ResNet50 is used as the backbone to extract the multilayer feature maps of RSI. Then, to enhance the model's position sensitivity to local objects in RSI, a new position-sensitive cross-layer interactive attention (PSCLIA) mechanism is designed, and based on it a novel PSCLI-TF encoder is constructed to perform layer-by-layer interactive fusion on the multilayer feature maps to obtain the multigranularity cross-layer fusion (CLF) feature of RSI. Finally, a prototype-based self-supervised loss function (SELF) is constructed to alleviate the semantic gap problem of "large intraclass variance and small interclass variance" in RSI scene classification. Comparative experimental results based on three datasets (i.e., AID, NWPU, and UCM) indicate that the classification performance of the designed PSCLI-TF model is highly competitive compared with other state-of-the-art methods.
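The pipeline the abstract describes — multilayer feature maps, cross-layer attention fusion, and a prototype-based loss — can be sketched as a minimal NumPy toy. All names, shapes, and formulas below are illustrative assumptions: this is generic cross-attention between a deep and a shallow feature map and a generic prototype cross-entropy, not the paper's actual PSCLIA mechanism or SELF loss.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_layer_attention(deep, shallow):
    """Toy cross-layer attention: deep-layer tokens (queries) attend to
    shallow-layer tokens (keys/values), then fuse with a residual add.
    deep: (N_q, d), shallow: (N_kv, d) token matrices."""
    d = deep.shape[-1]
    attn = softmax(deep @ shallow.T / np.sqrt(d), axis=-1)  # (N_q, N_kv)
    return deep + attn @ shallow                             # fused (N_q, d)

def prototype_loss(features, labels, prototypes):
    """Toy prototype-based loss: softmax over negative squared distances
    to per-class prototypes, cross-entropy against the true class.
    Pulls features toward their own prototype, away from the others."""
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (N, C)
    logp = np.log(softmax(-d2, axis=-1))
    return -logp[np.arange(len(labels)), labels].mean()
```

For example, fusing a 7x7 deep map (49 tokens) with a 14x14 shallow map (196 tokens), both projected to 64 channels, would call `cross_layer_attention` on matrices of shape (49, 64) and (196, 64) and return a fused (49, 64) representation.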

Bibliographic details

Published in: IEEE Geoscience and Remote Sensing Letters, 2024, Vol. 21, pp. 1-5
Authors: Li, Daxiang; Liu, Runyuan; Tang, Yao; Liu, Ying
Format: Article
Language: English
DOI: 10.1109/LGRS.2024.3359415
Publisher: IEEE, Piscataway
ISSN: 1545-598X
EISSN: 1558-0571
Source: IEEE Electronic Library (IEL)
Subjects:
Classification
Cross layer design
Feature extraction
Feature maps
Image classification
Multilayers
Position sensing
Position-sensitive transformer
Prototypes
Remote sensing
remote sensing image (RSI) classification
Scene classification
self-supervised learning
Semantics
Transformers