Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing
Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. The self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the ke...
Published in: | IEEE geoscience and remote sensing letters 2024, Vol.21, p.1-5 |
---|---|
Main authors: | Dong, Haobo; Song, Tianyu; Qi, Xuanyu; Jin, Guiyue; Jin, Jiyu; Ma, Ling |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 5 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE geoscience and remote sensing letters |
container_volume | 21 |
creator | Dong, Haobo Song, Tianyu Qi, Xuanyu Jin, Guiyue Jin, Jiyu Ma, Ling |
description | Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. The self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, these low-correlation self-attention values will still be included in the calculations indiscriminately, leading to further interference in the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). Specifically, adaptive top-k guided attention (ATGA) utilizes the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in self-attention calculation. Meanwhile, we design the learnable prompt block (LPB) within ATGA to further enhance the accuracy of sparse selection for attention enhancement. Here, LPB guides the TSO dynamically optimizing sparse rate and adaptively learning mask thresholds to further distill the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can improve the learning ability of dual frequency features. Extensive experimental results on several benchmarks show that our PGSformer outperforms the other competitive dehazing approach (RSDformer) by 0.92 dB on average PSNR. |
doi_str_mv | 10.1109/LGRS.2024.3450181 |
format | Article |
coden | IGRSBY |
eissn | 1558-0571 |
ieee_id | 10648722 |
linktohtml | https://ieeexplore.ieee.org/document/10648722 |
orcidid | 0000-0003-3607-0003 0000-0002-8267-7052 0000-0002-5546-2363 |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
tpages | 5 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1545-598X |
ispartof | IEEE geoscience and remote sensing letters, 2024, Vol.21, p.1-5 |
issn | 1545-598X 1558-0571 |
language | eng |
recordid | cdi_proquest_journals_3101349797 |
source | IEEE Xplore |
subjects | Benchmarks Convolution Correlation Frequency Frequency-domain analysis Image reconstruction Image restoration Interference Learning prompt Queries Remote sensing remote sensing (RS) image dehazing Task analysis top-k selection operator (TSO) Transformer Transformers |
title | Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T14%3A23%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prompt-Guided%20Sparse%20Transformer%20for%20Remote%20Sensing%20Image%20Dehazing&rft.jtitle=IEEE%20geoscience%20and%20remote%20sensing%20letters&rft.au=Dong,%20Haobo&rft.date=2024&rft.volume=21&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=1545-598X&rft.eissn=1558-0571&rft.coden=IGRSBY&rft_id=info:doi/10.1109/LGRS.2024.3450181&rft_dat=%3Cproquest_RIE%3E3101349797%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3101349797&rft_id=info:pmid/&rft_ieee_id=10648722&rfr_iscdi=true |
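
The abstract in the description field says that a top-k selection operator keeps only the most important attention scores from the keys for each query. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `topk_sparse_attention`, the fixed `top_k` argument, and the plain-list data layout are all assumptions made for illustration. The learnable prompt block, which the abstract says adapts the sparse rate and mask thresholds, is not modeled here.

```python
import math


def topk_sparse_attention(q, k, v, top_k):
    """Single-head attention in which each query attends only to its
    top_k highest-scoring keys (hypothetical helper, fixed top_k)."""
    d = len(q[0])
    out = []
    for qi in q:
        # Scaled dot-product score between this query and every key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        # Indices of the top_k largest scores; all other keys are dropped.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]
        # Softmax over the kept scores only (max subtracted for stability).
        m = max(scores[j] for j in keep)
        weights = {j: math.exp(scores[j] - m) for j in keep}
        z = sum(weights.values())
        # Weighted sum of the value vectors belonging to the kept keys.
        out.append([sum(weights[j] / z * v[j][c] for j in keep)
                    for c in range(len(v[0]))])
    return out


# One query, three keys; with top_k=1 only the best-matching key contributes.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(topk_sparse_attention(q, k, v, top_k=1))
```

Setting `top_k` to the number of keys recovers ordinary dense softmax attention, so the sketch makes the effect of discarding low-correlation query-key pairs easy to compare against the dense case.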