Prompt-Guided Sparse Transformer for Remote Sensing Image Dehazing

Bibliographic details
Published in: IEEE Geoscience and Remote Sensing Letters, 2024, Vol. 21, p. 1-5
Main authors: Dong, Haobo; Song, Tianyu; Qi, Xuanyu; Jin, Guiyue; Jin, Jiyu; Ma, Ling
Format: Article
Language: English
Description
Summary: Transformer-based methods have gradually shown excellent performance in remote sensing (RS) image dehazing tasks. Self-attention can effectively explore nonlocal features, which are crucial for restoring images obscured by haze. However, when the tokens from the query differ from those of the key, the resulting low-correlation self-attention values are still included in the calculations indiscriminately, which interferes further with the reconstruction of clear images. To better aggregate features, we propose a prompt-guided sparse Transformer (PGSformer). Specifically, adaptive top-k guided attention (ATGA) uses the top-k selection operator (TSO) to preserve the most important attention scores from the keys for each query, preventing interference from low-correlation query-key pairs in the self-attention calculation. Meanwhile, we design the learnable prompt block (LPB) within ATGA to further improve the accuracy of sparse selection. Here, the LPB guides the TSO in dynamically optimizing the sparsity rate and adaptively learning mask thresholds to further distill the selected features. In addition, the frequency selection feedforward network (FSFN) is designed to adaptively obtain frequency information, so that the overall pipeline can better learn dual-frequency features. Extensive experimental results on several benchmarks show that PGSformer outperforms a competitive dehazing approach (RSDformer) by 0.92 dB in average PSNR.
ISSN:1545-598X
1558-0571
DOI:10.1109/LGRS.2024.3450181
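
The top-k selection described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name topk_sparse_attention, the fixed top_k argument, and the use of NumPy are assumptions made for illustration only. In the paper, the sparsity rate and mask thresholds are learned through the LPB, which this sketch does not attempt to reproduce. It shows only the general idea of keeping the largest attention scores per query and discarding the rest before the softmax.

```python
import numpy as np


def topk_sparse_attention(q, k, v, top_k):
    """Scaled dot-product attention that keeps only the top_k scores per query.

    Hypothetical helper for illustration; shapes are
    q: (n_q, d), k: (n_k, d), v: (n_k, d_v).
    Returns the attended values (n_q, d_v) and the sparse weights (n_q, n_k).
    """
    n_k = k.shape[0]
    if not 1 <= top_k <= n_k:
        raise ValueError("top_k must be between 1 and the number of keys")

    # Dense query-key similarity, scaled as in standard attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Per-query threshold: the top_k-th largest score in each row.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]

    # Mask low-correlation pairs with -inf so they get zero weight.
    # Exact ties at the threshold would keep more than top_k entries.
    masked = np.where(scores >= kth, scores, -np.inf)

    # Numerically stable softmax over the surviving scores.
    masked = masked - masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((4, 8))
    k = rng.standard_normal((6, 8))
    v = rng.standard_normal((6, 5))
    out, w = topk_sparse_attention(q, k, v, top_k=2)
    print(out.shape, (w > 0).sum(axis=-1))
```

Setting top_k equal to the number of keys recovers ordinary dense softmax attention, which makes the sketch easy to compare against a standard baseline.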