Multi‐tiling neural radiance field (NeRF)—geometric assessment on large‐scale aerial datasets

Bibliographic details
Published in:	Photogrammetric record 2024-12, Vol.39 (188), p.718-740
Main authors:	Xu, Ningli; Qin, Rongjun; Huang, Debao; Remondino, Fabio
Format:	Article
Language:	eng
Subjects:	Accuracy; aerial photogrammetry; Aerial photography; Cameras; Consumption; Convergence; Datasets; Image reconstruction; Lidar; neural radiance field; Photogrammetry; Pipelining (computers); Qualitative analysis; Radiance; Random access memory; Representations; Sampling methods; three‐dimensional reconstruction; Tiling; Training
Online access:	Full text

Description
Summary:	Neural radiance fields (NeRF) offer the potential to benefit 3D reconstruction tasks, including aerial photogrammetry. However, the scalability and accuracy of the inferred geometry are not well documented for large‐scale aerial assets. We aim to provide a thorough assessment of NeRF in 3D reconstruction from aerial images and compare it with three traditional multi‐view stereo (MVS) pipelines. Typical NeRF approaches are not designed for large‐format aerial images, which results in very high memory consumption (often cost‐prohibitive) and slow convergence when they are applied directly to aerial assets. Although a few NeRF variants adopt a representation tiling scheme to increase scalability, the random ray‐sampling strategy used during training still hinders their general applicability to aerial assets. To perform an effective evaluation, we propose a new scheme to scale NeRF. In addition to representation tiling, we introduce a location‐specific sampling technique and a multi‐camera tiling (MCT) strategy that reduce RAM consumption during image loading and GPU memory consumption during representation training, and that increase the convergence rate within tiles. The MCT method decomposes a large‐frame image into multiple tiled images with different camera models, allowing these small‐frame images to be fed into the training process as needed for specific locations without a loss of accuracy. This enables NeRF approaches to be applied to aerial datasets on affordable computing devices, such as regular workstations. The proposed adaptation can be used to scale any existing NeRF method. Therefore, in this paper, instead of comparing accuracy across different NeRF variants, we implement our method on a representative approach, Mip‐NeRF, and compare it with three traditional photogrammetric MVS pipelines on a typical aerial dataset, using lidar reference data, to assess NeRF's performance. Both qualitative and quantitative results suggest that the proposed NeRF approach produces better completeness and object details than traditional approaches, although, as of now, it still falls short in terms of accuracy. The code and datasets are publicly available at https://github.com/GDAOSU/MCT_NERF. Neural radiance fields (NeRF) show promise for 3D reconstruction, including aerial photogrammetry. However, their scalability and accuracy for large‐scale aerial assets are unclear. We conduct a comprehensive evaluation, comparing NeRF wi
ISSN:	0031-868X, 1477-9730
DOI:	10.1111/phor.12498
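
Illustrative note on multi‐camera tiling: the summary states that a large‐frame image is decomposed into tiled images with different camera models, loaded as needed for specific locations. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an ideal pinhole camera without lens distortion, in which case cropping a tile at pixel offset (x0, y0) keeps the focal length and pose and shifts only the principal point. The names `Pinhole`, `split_into_tiles` and `tiles_seeing` are hypothetical and do not come from the MCT_NERF repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pinhole:
    """Ideal pinhole intrinsics in pixels (assumption: no lens distortion)."""
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int

    def project(self, x: float, y: float, z: float) -> Tuple[float, float]:
        """Project a camera-frame 3D point (z > 0) to pixel coordinates."""
        return self.fx * x / z + self.cx, self.fy * y / z + self.cy


Tile = Tuple[int, int, Pinhole]  # (x0, y0, camera model of the tile)


def split_into_tiles(cam: Pinhole, tile_w: int, tile_h: int) -> List[Tile]:
    """Cover the full frame with tiles, each with its own camera model.

    Only the principal point changes: it is shifted by the crop offset.
    Border tiles are smaller when the frame size is not a multiple of
    the tile size.
    """
    tiles: List[Tile] = []
    for y0 in range(0, cam.height, tile_h):
        for x0 in range(0, cam.width, tile_w):
            w = min(tile_w, cam.width - x0)
            h = min(tile_h, cam.height - y0)
            tile_cam = Pinhole(cam.fx, cam.fy, cam.cx - x0, cam.cy - y0, w, h)
            tiles.append((x0, y0, tile_cam))
    return tiles


def tiles_seeing(tiles: List[Tile], point: Tuple[float, float, float]) -> List[Tile]:
    """Return the tiles whose own camera model images the given point."""
    hits: List[Tile] = []
    for x0, y0, tile_cam in tiles:
        u, v = tile_cam.project(*point)
        if 0 <= u < tile_cam.width and 0 <= v < tile_cam.height:
            hits.append((x0, y0, tile_cam))
    return hits
```

Under these assumptions, a loader would call `split_into_tiles` once per image and then use `tiles_seeing` (or a footprint test against a scene tile) to read only the small images relevant to the location being trained, instead of holding every full frame in RAM. Real aerial cameras also have distortion parameters, which this sketch ignores.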