Pyramid Geometric Consistency Learning For Semantic Segmentation

Bibliographic details
Published in:	Pattern recognition 2023-01, Vol.133, p.109020, Article 109020
Main authors:	Zhang, Xian; Li, Qiang; Quan, Zhibin; Yang, Wankou
Format:	Article
Language:	English
Subjects:	Consistency learning; Semantic segmentation; Supervised contrastive learning
Online access:	Full text

Description
Abstract:	Highlights
• We propose a supervised pyramid consistency learning framework for semantic segmentation. In the data preparation stage, it obtains the overlap between different views. During training, the corresponding pairs and the label information are used together to improve the segmentation results. The method requires no additional computation and yields a stable performance improvement over the baseline on the main public datasets.
• We design CCM to supervise intermediate features while considering the similarity of both pixel-level and region-level features. We also introduce cross-layer feature consistency learning. The experimental results show that the pyramid-like CCM achieves better accuracy.
• We also design a mixed loss function optimized with labels and pseudo-labels. Using the similarity between the middle-layer features and the output features, PGC can help existing semantic segmentation models achieve better results.

Semantic segmentation is a critical task in the vision field. A common practice is to randomly transform each image into different augmented samples and to supervise the resulting views with the transformed semantic labels. However, even when the views are derived from the same sample, the predictions that the same network produces for them can differ greatly. We therefore argue that the transformation equivariance and the representational consistency between augmented samples also need to be supervised. Motivated by this, we propose a simple cross-data augmentation for semantic segmentation, in which we also leverage pixel-level consistency constraint learning between pairs of augmented samples. As a result, our scheme can significantly improve the performance of existing semantic segmentation models without additional computation overhead. We verified the effectiveness of the method on Deeplab V3 Plus. Experiments show that it achieves stable performance improvements on mainstream datasets such as Pascal VOC 2012, CamVid, and Cityscapes.
ISSN:	0031-3203 1873-5142
DOI:	10.1016/j.patcog.2022.109020
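
Illustration:	The abstract describes finding the overlap between two augmented views and applying a pixel-level consistency constraint on it. The sketch below is one minimal way to picture that idea, not the authors' implementation. It assumes the only augmentation is an axis-aligned crop, represents predictions as plain nested lists of per-pixel scores, and uses a mean squared difference as the consistency measure. The names `overlap`, `crop`, and `consistency_loss` are hypothetical, and the abstract does not specify the loss the paper actually uses.

```python
from typing import List, Optional, Tuple

# (top, left, height, width) of a crop, in source-image pixel coordinates.
Box = Tuple[int, int, int, int]
Grid = List[List[float]]


def overlap(a: Box, b: Box) -> Optional[Box]:
    """Return the region of the source image covered by both crops, or None."""
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[0] + a[2], b[0] + b[2])
    right = min(a[1] + a[3], b[1] + b[3])
    if bottom <= top or right <= left:
        return None
    return (top, left, bottom - top, right - left)


def crop(image: Grid, box: Box) -> Grid:
    """Cut a view out of the image (stands in for a real augmentation)."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def consistency_loss(pred_a: Grid, box_a: Box, pred_b: Grid, box_b: Box) -> float:
    """Mean squared difference between two prediction maps on their shared region.

    Assumption: each prediction map has the same size as its crop, so a source
    pixel (y, x) maps to index (y - top, x - left) inside each view.
    """
    region = overlap(box_a, box_b)
    if region is None:
        return 0.0  # no shared pixels, so nothing to constrain
    top, left, height, width = region
    total = 0.0
    for y in range(top, top + height):
        for x in range(left, left + width):
            value_a = pred_a[y - box_a[0]][x - box_a[1]]
            value_b = pred_b[y - box_b[0]][x - box_b[1]]
            total += (value_a - value_b) ** 2
    return total / (height * width)
```

In a real training loop the views would also be flipped, scaled, or color-jittered, so the pixel correspondence would have to invert those transforms as well, and a term like this would be added to the usual label-supervised segmentation loss.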