Neural 3D Scene Reconstruction with the Manhattan-world Assumption
Format: Article
Language: English
Abstract: This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images. Many previous works have shown impressive reconstruction results on textured objects, but they still have difficulty in handling low-textured planar regions, which are common in indoor scenes. One way to address this issue is to incorporate planar constraints into the depth map estimation of multi-view stereo-based methods, but the per-view plane estimation and depth optimization lack both efficiency and multi-view consistency. In this work, we show that planar constraints can be conveniently integrated into recent implicit neural representation-based reconstruction methods. Specifically, we use an MLP network to represent the signed distance function as the scene geometry. Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network. To resolve inaccurate segmentation, we encode the semantics of 3D points with another MLP and design a novel loss that jointly optimizes the scene geometry and semantics in 3D space. Experiments on the ScanNet and 7-Scenes datasets show that the proposed method outperforms previous methods by a large margin in 3D reconstruction quality. The code is available at https://zju3dv.github.io/manhattan_sdf.
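To make the planar regularization described in the abstract concrete, here is a minimal PyTorch sketch of how Manhattan-world constraints can be applied to the normals of an MLP-based SDF. It is an illustrative sketch, not the authors' released code: the names (`SDFNet`, `sdf_normals`, `manhattan_loss`) and the exact loss form are assumptions. Floor normals are pushed toward the up axis, and wall normals toward a dominant horizontal direction parameterized by an angle theta that is optimized jointly with the geometry.

```python
# Illustrative sketch (not the authors' code): Manhattan-world planar
# regularization on the normals of an MLP signed distance function.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SDFNet(nn.Module):
    """Small MLP mapping 3D points to signed distance values."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.mlp(x)

def sdf_normals(sdf_net, points):
    """Surface normals as the normalized gradient of the SDF (autograd)."""
    points = points.clone().requires_grad_(True)
    sdf = sdf_net(points)
    (grad,) = torch.autograd.grad(sdf.sum(), points, create_graph=True)
    return F.normalize(grad, dim=-1)

def manhattan_loss(normals, is_floor, is_wall, theta):
    """Planar losses under the Manhattan-world assumption.

    Floor normals should coincide with the up axis; wall normals should be
    parallel, anti-parallel, or orthogonal to a dominant horizontal
    direction d(theta).
    """
    up = normals.new_tensor([0.0, 0.0, 1.0])
    loss = normals.new_zeros(())
    if is_floor.any():
        # Floor: normal should equal the up axis.
        loss = loss + (1.0 - normals[is_floor] @ up).abs().mean()
    if is_wall.any():
        d = torch.stack([torch.cos(theta), torch.sin(theta),
                         torch.zeros_like(theta)])
        proj = normals[is_wall] @ d
        # Penalize the best of the three admissible wall orientations.
        cand = torch.stack([(1 - proj).abs(), (1 + proj).abs(), proj.abs()],
                           dim=-1)
        loss = loss + cand.min(dim=-1).values.mean()
    return loss

# Hypothetical usage: surface points with 2D-segmentation labels lifted
# to 3D (class order 0 = floor, 1 = wall, 2 = other is an assumption).
sdf_net = SDFNet()
theta = nn.Parameter(torch.tensor(0.0))
points = torch.rand(1024, 3)
labels = torch.randint(0, 3, (1024,))
normals = sdf_normals(sdf_net, points)
loss = manhattan_loss(normals, labels == 0, labels == 1, theta)
loss.backward()
```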
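The abstract's joint geometry-semantics optimization can be sketched in a similar hedged way: a second MLP predicts per-point class logits, and its softmax probabilities softly gate the planar terms, so constraints on points mislabeled by the 2D segmentation can be down-weighted as the 3D semantics are refined. `SemanticNet` and `weighted_planar_loss` are hypothetical names, and this soft gating stands in for the paper's full joint loss, which also supervises the semantics themselves.

```python
# Illustrative sketch (assumed design, not the paper's exact loss): a second
# MLP encodes 3D semantics, and its probabilities weight the planar terms.
import torch
import torch.nn as nn

class SemanticNet(nn.Module):
    """MLP mapping 3D points to logits for {floor, wall, other}."""
    def __init__(self, hidden=256, num_classes=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.mlp(x)

def weighted_planar_loss(normals, sem_logits, theta):
    """Planar errors gated by predicted floor/wall probabilities, coupling
    the scene geometry and the 3D semantics in one objective."""
    probs = sem_logits.softmax(dim=-1)          # [N, 3]
    up = normals.new_tensor([0.0, 0.0, 1.0])
    floor_err = (1.0 - normals @ up).abs()      # zero when normal == up
    d = torch.stack([torch.cos(theta), torch.sin(theta),
                     torch.zeros_like(theta)])
    proj = normals @ d
    wall_err = torch.stack([(1 - proj).abs(), (1 + proj).abs(), proj.abs()],
                           dim=-1).min(dim=-1).values
    # Class order {0: floor, 1: wall} is an assumption of this sketch.
    return (probs[:, 0] * floor_err + probs[:, 1] * wall_err).mean()
```

This sketch shows only how semantics can gate the geometric constraints; per the abstract, the full method additionally optimizes the semantics themselves jointly with the geometry in 3D space.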
DOI: 10.48550/arxiv.2205.02836