PointNest: Learning Deep Multiscale Nested Feature Propagation for Semantic Segmentation of 3-D Point Clouds

Bibliographic Details
Published in:	IEEE journal of selected topics in applied earth observations and remote sensing 2023, Vol.16, p.9051-9066
Main authors:	Wan, Jie; Zeng, Ziyin; Qiu, Qinjun; Xie, Zhong; Xu, Yongyang
Format:	Article
Language:	English
Subjects:	3-D point cloud; Aggregation; Benchmarks; Coders; Decoding; Deep learning; deep supervision (DS); Defects; Feature extraction; Image segmentation; multiscale feature propagation; Network architecture; Point cloud compression; Propagation; Scene analysis; Semantic segmentation; Semantics; Three dimensional models; Three-dimensional displays
Online access:	Full text

Description
Summary:	3-D point cloud semantic segmentation is a fundamental task for scene understanding, but it remains challenging due to diverse scene classes, data defects, and occlusions. Most existing deep learning-based methods focus on new designs of feature extraction operators but neglect the importance of exploiting multiscale point information in the network, which is crucial for identifying objects in complex scenes. To tackle this limitation, we propose an innovative network called PointNest that efficiently learns multiscale point feature propagation for accurate point segmentation. PointNest employs a deep nested U-shape encoder-decoder architecture, where the encoder learns multiscale point features through nested feature aggregation units at different network depths and propagates local geometric contextual information with skip connections along horizontal and vertical directions. The decoder then receives multiscale nested features from the encoder to progressively recover geometric details of the abstracted decoding point features for pointwise semantic prediction. In addition, we introduce a deep supervision strategy to further promote multiscale information propagation in the network for efficient training and performance improvement. Experiments on three public benchmarks demonstrate that PointNest outperforms existing mainstream methods, with mean intersection over union scores of 68.8%, 74.7%, and 62.7% on the S3DIS, Toronto-3D, and WHU-MLS datasets, respectively.
ISSN:	1939-1404, 2151-1535
DOI:	10.1109/JSTARS.2023.3315557
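
Note on the reported metric: the summary gives results as mean intersection over union scores. As a minimal sketch of how such a score is commonly computed from per-point labels, the following averages the per-class ratio of intersection to union. The function name `mean_iou`, its arguments, and the choice to skip classes that appear in neither the labels nor the predictions are assumptions made for illustration; this record does not describe the evaluation protocol the authors used.

```python
from collections import Counter


def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes seen in the labels or predictions.

    Hypothetical helper for illustration only, not the authors' evaluation code.
    """
    intersection = Counter()  # points where label and prediction agree, per class
    true_count = Counter()    # points per ground-truth class
    pred_count = Counter()    # points per predicted class

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union == 0:
            # Assumption: a class absent from both labels and predictions
            # is left out of the average.
            continue
        ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2]
    predictions = [0, 1, 1, 1, 2, 0]
    print(f"mIoU = {mean_iou(labels, predictions, num_classes=3):.3f}")
```

In the toy example the per-class values are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmark evaluations typically accumulate the counts over an entire test set before taking the ratio, rather than averaging per-scene scores.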