Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing

Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. Self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the ke...

Bibliographic details
Published in: IEEE geoscience and remote sensing letters, 2024, Vol. 21, p. 1-5
Main authors: Dong, Haobo; Song, Tianyu; Qi, Xuanyu; Jin, Guiyue; Jin, Jiyu; Ma, Ling
Format: Article
Language: eng
Subjects:
container_end_page 5
container_issue
container_start_page 1
container_title IEEE geoscience and remote sensing letters
container_volume 21
creator Dong, Haobo
Song, Tianyu
Qi, Xuanyu
Jin, Guiyue
Jin, Jiyu
Ma, Ling
description Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. Self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, the resulting low-correlation self-attention values are still included in the calculation indiscriminately, which further interferes with the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). Specifically, adaptive top-k guided attention (ATGA) uses the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in the self-attention calculation. Meanwhile, we design a learnable prompt block (LPB) within ATGA to further improve the accuracy of sparse selection. Here, the LPB guides the TSO to dynamically optimize the sparse rate and adaptively learn mask thresholds, further distilling the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can better learn dual-frequency features. Extensive experimental results on several benchmarks show that PGSformer outperforms a competitive dehazing approach (RSDformer) by 0.92 dB in average PSNR.
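The top-k selection idea in the description can be illustrated with a minimal sketch. This is not the authors' ATGA implementation: it assumes a fixed k, whereas the abstract says the sparse rate and mask thresholds are guided by the learnable prompt block, and the function names `softmax` and `topk_sparse_attention` are hypothetical. For each query, only the k largest scaled dot-product scores are kept; the rest are masked to negative infinity so they receive zero weight after the softmax.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf get weight 0."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topk_sparse_attention(queries, keys, values, k):
    """Attention that keeps only the k highest-scoring keys per query.

    Illustrative assumption: k is a fixed integer. The paper's prompt-guided,
    adaptive choice of the sparse rate is not reproduced here.
    """
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    dim = len(queries[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(dim)
            for key in keys
        ]
        # Indices of the k largest scores for this query.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        keep = set(order[:k])
        # Mask low-correlation pairs so they contribute nothing.
        masked = [s if j in keep else -math.inf for j, s in enumerate(scores)]
        weights = softmax(masked)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values))
             for c in range(len(values[0]))]
        )
    return outputs
```

With k equal to the number of keys, this reduces to ordinary dense attention; smaller k discards the weakest query-key pairs before normalization.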
doi_str_mv 10.1109/LGRS.2024.3450181
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_3101349797</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10648722</ieee_id><sourcerecordid>3101349797</sourcerecordid><originalsourceid>FETCH-LOGICAL-c176t-9bb44b041df48693bfdf04035ac56867a4119746d01e35c722e1ac5864de1fa03</originalsourceid><addsrcrecordid>eNpNkE9PwzAMxSMEEmPwAZA4ROLcETdJkx5hwJhUCbQNiVuUtu7oRP-QdAf49KTaDpyeLf-ebT1CroHNAFh6ly1W61nMYjHjQjLQcEImIKWOmFRwOtZCRjLVH-fkwvsdC6TWakIe3lzX9EO02NcllnTdW-eRbpxtfdW5Bh0NQlfYdAPSNba-brd02dgt0kf8tL-hvSRnlf3yeHXUKXl_ftrMX6LsdbGc32dRASoZojTPhciZgLISOkl5XpUVE4xLW8hEJ8oKgFSJpGSAXBYqjhHCSCeiRKgs41Nye9jbu-57j34wu27v2nDScGDARapSFSg4UIXrvHdYmd7VjXU_BpgZozJjVGaMyhyjCp6bg6dGxH98InR4g_8BlIhj9g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3101349797</pqid></control><display><type>article</type><title>Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing</title><source>IEEE Xplore</source><creator>Dong, Haobo ; Song, Tianyu ; Qi, Xuanyu ; Jin, Guiyue ; Jin, Jiyu ; Ma, Ling</creator><creatorcontrib>Dong, Haobo ; Song, Tianyu ; Qi, Xuanyu ; Jin, Guiyue ; Jin, Jiyu ; Ma, Ling</creatorcontrib><description>Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. The self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, these low-correlation self-attention values will still be included in the calculations indiscriminately, leading to further interference in the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). 
Specifically, adaptive top-k guided attention (ATGA) utilizes the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in self-attention calculation. Meanwhile, we design the learnable prompt block (LPB) within ATGA to further enhance the accuracy of sparse selection for attention enhancement. Here, LPB guides the TSO dynamically optimizing sparse rate and adaptively learning mask thresholds to further distill the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can improve the learning ability of dual frequency features. Extensive experimental results on several benchmarks show that our PGSformer outperforms the other competitive dehazing approach (RSDformer) by 0.92 dB on average PSNR.</description><identifier>ISSN: 1545-598X</identifier><identifier>EISSN: 1558-0571</identifier><identifier>DOI: 10.1109/LGRS.2024.3450181</identifier><identifier>CODEN: IGRSBY</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Benchmarks ; Convolution ; Correlation ; Frequency ; Frequency-domain analysis ; Image reconstruction ; Image restoration ; Interference ; Learning ; prompt ; Queries ; Remote sensing ; remote sensing (RS) image dehazing ; Task analysis ; top-k selection operator (TSO) ; Transformer ; Transformers</subject><ispartof>IEEE geoscience and remote sensing letters, 2024, Vol.21, p.1-5</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c176t-9bb44b041df48693bfdf04035ac56867a4119746d01e35c722e1ac5864de1fa03</cites><orcidid>0000-0003-3607-0003 ; 0000-0002-8267-7052 ; 0000-0002-5546-2363</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10648722$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,4010,27900,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10648722$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Dong, Haobo</creatorcontrib><creatorcontrib>Song, Tianyu</creatorcontrib><creatorcontrib>Qi, Xuanyu</creatorcontrib><creatorcontrib>Jin, Guiyue</creatorcontrib><creatorcontrib>Jin, Jiyu</creatorcontrib><creatorcontrib>Ma, Ling</creatorcontrib><title>Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing</title><title>IEEE geoscience and remote sensing letters</title><addtitle>LGRS</addtitle><description>Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. The self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, these low-correlation self-attention values will still be included in the calculations indiscriminately, leading to further interference in the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). 
Specifically, adaptive top-k guided attention (ATGA) utilizes the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in self-attention calculation. Meanwhile, we design the learnable prompt block (LPB) within ATGA to further enhance the accuracy of sparse selection for attention enhancement. Here, LPB guides the TSO dynamically optimizing sparse rate and adaptively learning mask thresholds to further distill the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can improve the learning ability of dual frequency features. Extensive experimental results on several benchmarks show that our PGSformer outperforms the other competitive dehazing approach (RSDformer) by 0.92 dB on average PSNR.</description><subject>Benchmarks</subject><subject>Convolution</subject><subject>Correlation</subject><subject>Frequency</subject><subject>Frequency-domain analysis</subject><subject>Image reconstruction</subject><subject>Image restoration</subject><subject>Interference</subject><subject>Learning</subject><subject>prompt</subject><subject>Queries</subject><subject>Remote sensing</subject><subject>remote sensing (RS) image dehazing</subject><subject>Task analysis</subject><subject>top-k selection operator 
(TSO)</subject><subject>Transformer</subject><subject>Transformers</subject><issn>1545-598X</issn><issn>1558-0571</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE9PwzAMxSMEEmPwAZA4ROLcETdJkx5hwJhUCbQNiVuUtu7oRP-QdAf49KTaDpyeLf-ebT1CroHNAFh6ly1W61nMYjHjQjLQcEImIKWOmFRwOtZCRjLVH-fkwvsdC6TWakIe3lzX9EO02NcllnTdW-eRbpxtfdW5Bh0NQlfYdAPSNba-brd02dgt0kf8tL-hvSRnlf3yeHXUKXl_ftrMX6LsdbGc32dRASoZojTPhciZgLISOkl5XpUVE4xLW8hEJ8oKgFSJpGSAXBYqjhHCSCeiRKgs41Nye9jbu-57j34wu27v2nDScGDARapSFSg4UIXrvHdYmd7VjXU_BpgZozJjVGaMyhyjCp6bg6dGxH98InR4g_8BlIhj9g</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Dong, Haobo</creator><creator>Song, Tianyu</creator><creator>Qi, Xuanyu</creator><creator>Jin, Guiyue</creator><creator>Jin, Jiyu</creator><creator>Ma, Ling</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7TG</scope><scope>7UA</scope><scope>8FD</scope><scope>C1K</scope><scope>F1W</scope><scope>FR3</scope><scope>H8D</scope><scope>H96</scope><scope>JQ2</scope><scope>KL.</scope><scope>KR7</scope><scope>L.G</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-3607-0003</orcidid><orcidid>https://orcid.org/0000-0002-8267-7052</orcidid><orcidid>https://orcid.org/0000-0002-5546-2363</orcidid></search><sort><creationdate>2024</creationdate><title>Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing</title><author>Dong, Haobo ; Song, Tianyu ; Qi, Xuanyu ; Jin, Guiyue ; Jin, Jiyu ; Ma, 
Ling</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c176t-9bb44b041df48693bfdf04035ac56867a4119746d01e35c722e1ac5864de1fa03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Benchmarks</topic><topic>Convolution</topic><topic>Correlation</topic><topic>Frequency</topic><topic>Frequency-domain analysis</topic><topic>Image reconstruction</topic><topic>Image restoration</topic><topic>Interference</topic><topic>Learning</topic><topic>prompt</topic><topic>Queries</topic><topic>Remote sensing</topic><topic>remote sensing (RS) image dehazing</topic><topic>Task analysis</topic><topic>top-k selection operator (TSO)</topic><topic>Transformer</topic><topic>Transformers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Dong, Haobo</creatorcontrib><creatorcontrib>Song, Tianyu</creatorcontrib><creatorcontrib>Qi, Xuanyu</creatorcontrib><creatorcontrib>Jin, Guiyue</creatorcontrib><creatorcontrib>Jin, Jiyu</creatorcontrib><creatorcontrib>Ma, Ling</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Meteorological &amp; Geoastrophysical Abstracts</collection><collection>Water Resources Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy &amp; 
Non-Living Resources</collection><collection>ProQuest Computer Science Collection</collection><collection>Meteorological &amp; Geoastrophysical Abstracts - Academic</collection><collection>Civil Engineering Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE geoscience and remote sensing letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dong, Haobo</au><au>Song, Tianyu</au><au>Qi, Xuanyu</au><au>Jin, Guiyue</au><au>Jin, Jiyu</au><au>Ma, Ling</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing</atitle><jtitle>IEEE geoscience and remote sensing letters</jtitle><stitle>LGRS</stitle><date>2024</date><risdate>2024</risdate><volume>21</volume><spage>1</spage><epage>5</epage><pages>1-5</pages><issn>1545-598X</issn><eissn>1558-0571</eissn><coden>IGRSBY</coden><abstract>Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. The self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, these low-correlation self-attention values will still be included in the calculations indiscriminately, leading to further interference in the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). 
Specifically, adaptive top-k guided attention (ATGA) utilizes the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in self-attention calculation. Meanwhile, we design the learnable prompt block (LPB) within ATGA to further enhance the accuracy of sparse selection for attention enhancement. Here, LPB guides the TSO dynamically optimizing sparse rate and adaptively learning mask thresholds to further distill the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can improve the learning ability of dual frequency features. Extensive experimental results on several benchmarks show that our PGSformer outperforms the other competitive dehazing approach (RSDformer) by 0.92 dB on average PSNR.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/LGRS.2024.3450181</doi><tpages>5</tpages><orcidid>https://orcid.org/0000-0003-3607-0003</orcidid><orcidid>https://orcid.org/0000-0002-8267-7052</orcidid><orcidid>https://orcid.org/0000-0002-5546-2363</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1545-598X
ispartof IEEE geoscience and remote sensing letters, 2024, Vol.21, p.1-5
issn 1545-598X
1558-0571
language eng
recordid cdi_proquest_journals_3101349797
source IEEE Xplore
subjects Benchmarks
Convolution
Correlation
Frequency
Frequency-domain analysis
Image reconstruction
Image restoration
Interference
Learning
prompt
Queries
Remote sensing
remote sensing (RS) image dehazing
Task analysis
top-k selection operator (TSO)
Transformer
Transformers
title Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T14%3A23%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prompt-Guided%20Sparse%20Transformer%20for%20Remote%20Sensing%20Image%20Dehazing&rft.jtitle=IEEE%20geoscience%20and%20remote%20sensing%20letters&rft.au=Dong,%20Haobo&rft.date=2024&rft.volume=21&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=1545-598X&rft.eissn=1558-0571&rft.coden=IGRSBY&rft_id=info:doi/10.1109/LGRS.2024.3450181&rft_dat=%3Cproquest_RIE%3E3101349797%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3101349797&rft_id=info:pmid/&rft_ieee_id=10648722&rfr_iscdi=true