PiCoCo: Pixelwise Contrast and Consistency Learning for Semisupervised Building Footprint Segmentation

Bibliographic Details
Published in:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, Vol. 14, pp. 10548-10559
Main Authors:	Kang, Jian; Wang, Zhirui; Zhu, Ruoxin; Sun, Xian; Fernandez-Beltran, Ruben; Plaza, Antonio
Format:	Article
Language:	English
Subjects:	Annotations; Artificial neural networks; Building footprint segmentation; Buildings; Computer architecture; Consistency; consistency learning; contrastive learning; Disaster management; Enforcement; Feature extraction; Footprints; Image annotation; Image processing; Image resolution; Image segmentation; Learning; Methods; missing labels; Neural networks; Pixels; Population density; Predictions; Predictive models; Remote sensing; semantic segmentation; Semantics; Semi-supervised learning; Semisupervised learning; Training data; Urban planning

Description
Abstract:	Building footprint segmentation from high-resolution remote sensing (RS) images plays a vital role in urban planning, disaster response, and population density estimation. Convolutional neural networks (CNNs) have recently become the workhorse for generating building footprints. However, to fully exploit the predictive power of CNNs, large-scale pixel-level annotations are required. Most state-of-the-art CNN-based methods focus on designing network architectures that improve building footprint prediction with full annotations, while little work has addressed building footprint segmentation with limited annotations. In this article, we propose a novel semisupervised learning method for building footprint segmentation, which can effectively predict building footprints with a network trained on few annotations (e.g., only 0.0324 km² of a 2.25 km² area is labeled). The proposed method exploits the contrast between building and background pixels in latent space and the consistency of the predictions obtained from the CNN models when the input RS images are perturbed. Accordingly, we term the proposed semisupervised learning framework for building footprint segmentation PiCoCo, as it is based on the enforcement of Pixelwise Contrast and Consistency during the learning phase. Our experiments, conducted on two benchmark building segmentation datasets, validate the effectiveness of the proposed framework compared with several state-of-the-art building footprint extraction and semisupervised semantic segmentation methods.
ISSN:	1939-1404, 2151-1535
DOI:	10.1109/JSTARS.2021.3119286
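
The abstract names two training signals: contrast between building and background pixels in latent space, and consistency of predictions when the input image is perturbed. The following is a minimal NumPy sketch of what such loss terms could look like. It is not the authors' implementation: the function names, the InfoNCE-style form of the contrast term, the temperature value, and the use of mean squared error for consistency are all assumptions made for illustration.

```python
import numpy as np


def consistency_loss(prob_a, prob_b):
    """Mean squared difference between two building-probability maps.

    Assumption: prob_a and prob_b are predictions for the same scene under
    two different input perturbations, already aligned pixel to pixel.
    """
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    return float(np.mean((prob_a - prob_b) ** 2))


def pixel_contrast_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style contrast over labeled pixel embeddings (hypothetical form).

    embeddings: (N, D) array, one latent vector per labeled pixel.
    labels:     (N,) array, 1 for building and 0 for background.
    Pixels sharing a label are treated as positives; all other pixels
    act as negatives.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length vectors
    sim = z @ z.T / temperature                       # scaled cosine similarity

    n = labels.shape[0]
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax over every other pixel; the anchor itself is masked out.
    sim = np.where(not_self, sim, -np.inf)
    sim_max = sim.max(axis=1, keepdims=True)
    shifted = sim - sim_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    pos_count = positives.sum(axis=1)
    valid = pos_count > 0                             # anchors with a positive
    if not valid.any():
        return 0.0
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / pos_count[valid]
    return float(per_anchor.mean())


# Toy example: four pixel embeddings forming two tight clusters.
toy_embeddings = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
separated = pixel_contrast_loss(toy_embeddings, np.array([1, 1, 0, 0]))
mixed = pixel_contrast_loss(toy_embeddings, np.array([1, 0, 1, 0]))
```

In this toy example, `separated` is much smaller than `mixed`, because the contrast term is low when same-class pixels sit close together in latent space and far from the other class. In a semisupervised setting, a term like `pixel_contrast_loss` would only be computable on the labeled pixels, whereas `consistency_loss` needs no labels and could therefore also be applied to the unlabeled area.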