Weakly-Supervised Semantic Segmentation in Aerial Imagery via Explicit Pixel-Level Constraints
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2022, Vol. 60, pp. 1-1
Main authors: , , , , , ,
Format: Article
Language: English
Abstract: In recent years, image-level weakly supervised semantic segmentation (WSSS) has developed rapidly in natural scenes due to the easy availability of classification tags. However, owing to the complex backgrounds, multi-category scenes, and dense small targets in remote sensing (RS) images, relatively little research has been conducted in this field. To alleviate the impact of these problems in RS scenes, a self-supervised Siamese network based on an explicit pixel-level constraints framework is proposed, which greatly improves the quality of class activation maps and the positioning accuracy in multi-category RS scenes. Specifically, three novel components push performance to a new level: (a) a pixel-soft classification loss, which imposes explicit constraints on pixels during image-level training; (b) a pixel global awareness module, which captures high-level semantic context and low-level pixel spatial information to improve the consistency and accuracy of RS object segmentation; and (c) a dynamic multi-scale fusion module with a gating mechanism, which enhances feature representation and improves the positioning accuracy of RS objects, particularly small and dense ones. Experiments on two challenging RS datasets demonstrate that the proposed modules achieve new state-of-the-art results using only image-level labels, improving mIoU to 36.79% on iSAID and 45.43% on ISPRS in the WSSS task. To the best of our knowledge, this is the first work to perform image-level WSSS on multi-class RS scenes.
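The abstract's item (c) describes fusing features from multiple scales through a gating mechanism. The sketch below is a hedged, minimal NumPy illustration of the general idea (soft gates, normalized across scales, weighting a sum of same-shaped feature maps), not a reproduction of the paper's actual module; the function name `gated_multiscale_fusion` and the use of a plain softmax over scalar per-scale logits are assumptions for illustration only.

```python
import numpy as np

def gated_multiscale_fusion(features, gate_logits):
    """Fuse same-shaped feature maps from several scales via soft gating.

    Hedged sketch of the idea, not the paper's exact module: per-scale gate
    logits are normalized with a softmax so the gates sum to 1, and the fused
    map is the gate-weighted sum over scales: sum_s g_s * F_s.
    """
    logits = np.asarray(gate_logits, dtype=float)
    gates = np.exp(logits - logits.max())   # stabilized exponentials
    gates /= gates.sum()                    # softmax over scales
    fused = np.zeros_like(features[0], dtype=float)
    for g, f in zip(gates, features):       # gate-weighted accumulation
        fused += g * f
    return fused, gates

# Toy example: three scales already resampled to a common (1, 2, 2) grid.
feats = [np.full((1, 2, 2), v, dtype=float) for v in (1.0, 2.0, 3.0)]
fused, gates = gated_multiscale_fusion(feats, gate_logits=[0.0, 0.0, 0.0])
# Equal logits give uniform gates, so the fused map is the mean of the scales.
```

In a real network the gate logits would themselves be predicted from the input (that is the "dynamic" part), letting the model emphasize fine scales for small, dense objects.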
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2022.3224477