Strong and Weak Prompt Engineering for Remote Sensing Image-Text Cross-Modal Retrieval
Published in: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025-01, p.1-12 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Abstract: | Cross-modal retrieval is vital at the intersection of vision and language. Specifically, remote sensing image-text retrieval enhances our understanding of complex remote sensing content by combining multi-perspective visual information with concise textual descriptions, and it has increasingly become a research hotspot. Existing prompts typically emphasize either global or local information, so they fail to extract or fully leverage the useful information in cross-modal data, which results in subpar retrieval performance. To address these limitations, we propose a novel method called Strong and Weak Prompt Engineering (SWPE) for remote sensing image-text retrieval. Specifically, SWPE employs the Strong and Weak Prompt Generation (SWPG) module to generate fine-grained and global category semantic prompts via an attention mechanism and a pretrained classification model. The Prompt-guided Feature Finetuning (PFF) module then refines the prompt information using a Transformer architecture, integrating the refined prompts with high-level image and text features to enhance both fine-grained details and global semantics. Finally, the Adaptive Hard Sample Elimination (AHSE) module optimizes the triplet loss function by training the model with negative sample pairs of varying difficulty, assigning higher weights to simpler pairs. Extensive quantitative and qualitative experiments on four remote sensing benchmarks validate the effectiveness of SWPE. |
---|---|
ISSN: | 1939-1404 2151-1535 |
DOI: | 10.1109/JSTARS.2025.3534474 |
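
The abstract describes AHSE only at a high level: a triplet loss over negative pairs of varying difficulty, with higher weights on simpler pairs. The record gives no formula, so the following is merely an illustrative sketch of that general idea, not the authors' implementation. The function name `weighted_triplet_loss`, the softmax-style weighting, and the `margin` and `temperature` parameters are all assumptions introduced here.

```python
import numpy as np

def weighted_triplet_loss(img, txt, margin=0.2, temperature=0.1):
    """Bidirectional triplet loss that down-weights hard negatives.

    img, txt: (n, d) arrays where row i of img matches row i of txt (n >= 2).
    The weighting scheme is a hypothetical choice, not taken from the paper.
    """
    # L2-normalise so that dot products are cosine similarities.
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    sim = img @ txt.T                       # sim[i, j]: image i vs. text j
    pos = np.diag(sim)                      # matched pairs lie on the diagonal
    neg_mask = ~np.eye(len(sim), dtype=bool)

    def one_direction(s, p):
        # Hinge violation of each (anchor i, negative j) pair.
        violation = np.maximum(0.0, margin + s - p[:, None])
        # Difficulty: how close the negative scores to the positive.
        difficulty = s - p[:, None]
        # Assumed weighting: softmax over negated difficulty, so simpler
        # negatives receive larger weights; positives are masked out.
        logits = np.where(neg_mask, -difficulty / temperature, -np.inf)
        logits -= logits.max(axis=1, keepdims=True)
        w = np.exp(logits)
        w /= w.sum(axis=1, keepdims=True)
        return (w * violation).sum(axis=1).mean()

    # Image-to-text plus text-to-image retrieval directions.
    return one_direction(sim, pos) + one_direction(sim.T, pos)

# Usage: random embeddings stand in for real image and text features.
rng = np.random.default_rng(0)
loss = weighted_triplet_loss(rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
```

Under this assumed weighting, a lower `temperature` concentrates almost all of the weight on the easiest negative of each anchor, while a large `temperature` approaches a uniform average over negatives.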