ISHS-Net: Single-View 3D Reconstruction by Fusing Features of Image and Shape Hierarchical Structures

Bibliographic Details
Published in: Remote Sensing (Basel, Switzerland), 2023-12, Vol. 15 (23), p. 5449
Authors: Gao, Guoqing; Yang, Liang; Zhang, Quan; Wang, Chongmin; Bao, Hua; Rao, Changhui
Format: Article
Language: English
Online access: Full text
Description
Abstract: The reconstruction of 3D shapes from a single view has been a longstanding challenge. Previous methods have primarily focused on learning one of three kinds of features: geometric features, which depict overall shape contours but are insufficient for occluded regions; local features, which capture details but cannot represent the complete structure; or structural features, which encode part relationships but require predefined semantics. The fusion of geometric, local, and structural features has been lacking, leading to inaccurate reconstruction of shapes with occlusions or novel part compositions. To address this issue, we propose a two-stage approach to 3D shape reconstruction. In the first stage, we encode the hierarchical structure features of the 3D shape using an encoder-decoder network. In the second stage, we enhance these hierarchical structure features by fusing them with global and point features and feed the enhanced features into a signed distance function (SDF) prediction network to obtain rough SDF values. Using the camera pose, we project arbitrary 3D points in space onto the CNN's feature maps at different depths and obtain their corresponding positions; the features at these positions are concatenated to form local features, which are also fed into the SDF prediction network to obtain fine-grained SDF values. By fusing the two sets of SDF values, we improve the model's accuracy and enable it to reconstruct other object types with higher quality. Comparative experiments demonstrate that the proposed method outperforms state-of-the-art approaches in terms of accuracy.
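To make the second stage of the abstract concrete, the following PyTorch sketch shows how 3D query points might be projected onto multi-depth CNN feature maps via the camera pose, sampled into concatenated local features, and fed to rough and fine SDF heads whose outputs are blended. It is an illustration under stated assumptions, not the authors' released implementation: the module names, feature dimensions, camera convention, and fusion weight alpha are all hypothetical.

# Minimal sketch of pixel-aligned local-feature sampling and rough/fine
# SDF fusion, as described in the abstract. NOT the authors' code:
# dimensions, camera convention, and the fusion weight are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def project_points(points, K, Rt, image_size):
    """Project world-space 3D points into normalized image coordinates.
    points:     (B, P, 3) query points in world space
    K:          (B, 3, 3) camera intrinsics
    Rt:         (B, 3, 4) camera extrinsics [R | t]
    image_size: (width, height) in pixels
    Returns uv: (B, P, 2) coordinates in [-1, 1] for grid_sample."""
    B, P, _ = points.shape
    homo = torch.cat([points, points.new_ones(B, P, 1)], dim=-1)  # (B, P, 4)
    cam = homo @ Rt.transpose(1, 2)                               # (B, P, 3)
    pix = cam @ K.transpose(1, 2)
    pix = pix[..., :2] / pix[..., 2:3].clamp(min=1e-6)            # perspective divide
    wh = points.new_tensor(image_size)                            # (2,)
    return 2.0 * pix / wh - 1.0                                   # normalize to [-1, 1]

def sample_local_features(feature_maps, uv):
    """Bilinearly sample every CNN depth level at the projected positions
    and concatenate along channels -- the 'local features' of the abstract.
    feature_maps: list of (B, C_i, H_i, W_i); uv: (B, P, 2) in [-1, 1]."""
    grid = uv.unsqueeze(1)                                        # (B, 1, P, 2)
    samples = [F.grid_sample(f, grid, align_corners=True).squeeze(2)
               for f in feature_maps]                             # each (B, C_i, P)
    return torch.cat(samples, dim=1).transpose(1, 2)              # (B, P, sum C_i)

class SDFHead(nn.Module):
    """Small MLP standing in for the paper's SDF prediction network."""
    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))
    def forward(self, x):                                         # (B, P, in_dim)
        return self.net(x).squeeze(-1)                            # (B, P)

# Rough branch: hierarchical-structure features fused with global features
# plus the query coordinates. Fine branch: pixel-aligned local features.
rough_head = SDFHead(in_dim=512 + 256 + 3)  # dims are illustrative
fine_head = SDFHead(in_dim=448)             # = sum of sampled channels

def predict_sdf(points, struct_global_feat, feature_maps, K, Rt,
                image_size, alpha=0.5):
    B, P, _ = points.shape
    # rough SDF from the fused structural/global representation + xyz
    rough_in = torch.cat([struct_global_feat.unsqueeze(1).expand(B, P, -1),
                          points], dim=-1)
    sdf_rough = rough_head(rough_in)
    # fine SDF from features sampled at the points' image projections
    uv = project_points(points, K, Rt, image_size)
    sdf_fine = fine_head(sample_local_features(feature_maps, uv))
    # the abstract says the two SDF sets are fused; a weighted average
    # is one plausible (assumed) fusion rule
    return alpha * sdf_rough + (1.0 - alpha) * sdf_fine

In this sketch a fixed weighted average blends the two SDF estimates; the abstract does not specify the fusion rule, and a learned blending weight would be an equally plausible reading.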
ISSN: 2072-4292
DOI: 10.3390/rs15235449