Scale-Robust Deep-Supervision Network for Mapping Building Footprints From High-Resolution Remote Sensing Images

Bibliographic details
Published in: IEEE journal of selected topics in applied earth observations and remote sensing 2021, Vol. 14, p. 10091-10100
Main authors: Guo, Haonan; Su, Xin; Tang, Shengkun; Du, Bo; Zhang, Liangpei
Format: Article
Language: English
Description
Summary: Building footprint information is a key input for sustainable urban planning and environmental monitoring. Mapping building footprints from remote sensing images is an important and challenging task in earth observation. Over the years, convolutional neural networks have brought marked improvements to building extraction because they automatically extract hierarchical features and make building predictions. However, because buildings vary in size, scene, and roofing material, it is hard to delineate buildings of varied sizes precisely, especially over large areas (e.g., nationwide). To address these limitations, we propose a novel deep-supervision convolutional neural network (denoted DS-Net) for extracting building footprints from high-resolution remote sensing images. The proposed network applies deep supervision with an extra lightweight encoder, which enables it to learn representative building features at different scales. Furthermore, a scale attention module aggregates the multiscale features and generates the final building prediction. Experiments on two publicly available building datasets, the WHU Building Dataset and the Massachusetts Building Dataset, show the effectiveness of the proposed method. With only 0.22 M more parameters than U-Net, DS-Net achieves an IoU of 90.4% on the WHU Building Dataset and 73.8% on the Massachusetts Building Dataset. DS-Net also outperforms state-of-the-art building extraction methods on both datasets, indicating the effectiveness of the proposed deep supervision and scale attention.
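The summary reports accuracy as IoU (intersection over union). As a reader aid, the following is a minimal sketch of how IoU is commonly computed for a pair of binary building masks. It is not taken from the article: the function name `iou`, the nested-list mask representation, and the convention for two empty masks are all illustrative assumptions.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1  # pixel labeled building in both masks
            if p or t:
                union += 1  # pixel labeled building in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(iou(pred, truth))  # 2 shared pixels / 4 pixels in the union = 0.5
```

Published benchmarks may aggregate intersections and unions over a whole test set before dividing, rather than averaging per-image values, so figures such as those in the summary are not necessarily comparable with a per-image average from this sketch.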
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2021.3109237