AR-NeRF: Unsupervised Learning of Depth and Defocus Effects from Natural Images with Aperture Rendering Neural Radiance Fields
Format: | Article |
---|---|
Language: | English |
Abstract: | Fully unsupervised 3D representation learning has gained attention owing to
its advantages in data collection. A successful line of work takes a
viewpoint-aware approach: it learns an image distribution with generative
models (e.g., generative adversarial networks (GANs)) while generating various
view images with 3D-aware models (e.g., neural radiance fields (NeRFs)).
However, such methods require images captured from various viewpoints for
training, so their application to datasets with few or limited viewpoints
remains a challenge. As a complementary approach, an aperture rendering GAN
(AR-GAN), which employs a defocus cue, was proposed. However, AR-GAN is a
CNN-based model and represents defocus independently of viewpoint changes
despite the strong correlation between the two, which is one of the reasons
for its performance limitations. As an alternative to AR-GAN, we propose an
aperture rendering NeRF (AR-NeRF), which can utilize viewpoint and defocus
cues in a unified manner by representing both factors within a common
ray-tracing framework. Moreover, to learn defocus-aware and defocus-independent
representations in a disentangled manner, we propose aperture randomized
training, in which the model learns to generate images while randomizing the
aperture size and latent codes independently. In experiments, we applied
AR-NeRF to various natural image datasets, including flower, bird, and face
images; the results demonstrate the utility of AR-NeRF for unsupervised
learning of depth and defocus effects. |
---|---|
DOI: | 10.48550/arxiv.2206.06100 |
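The "common ray-tracing framework" mentioned in the abstract can be pictured as rendering each pixel by tracing rays from points on a finite lens aperture rather than from a single pinhole, so viewpoint and defocus both emerge from the same ray geometry. Below is a minimal NumPy sketch of that idea under a thin-lens camera model; the function names, the uniform-disk aperture sampling, and the fixed focus distance are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sample_aperture_rays(pixel_dir, aperture_radius, focus_dist, n_rays=16):
    """Thin-lens ray sampling (illustrative sketch, not the paper's code).

    pixel_dir: unit direction of the central (pinhole) ray for one pixel,
               in camera coordinates where the camera looks down +z.
    Returns (origins, directions): rays that start on the lens aperture and
    converge at the in-focus point, so averaging their colors yields defocus.
    """
    # Point on the focal plane that this pixel sees in perfect focus.
    focus_point = pixel_dir * (focus_dist / pixel_dir[2])

    # Uniformly sample ray origins on a disk-shaped aperture in the z=0 plane.
    r = aperture_radius * np.sqrt(np.random.rand(n_rays))
    theta = 2.0 * np.pi * np.random.rand(n_rays)
    origins = np.stack([r * np.cos(theta), r * np.sin(theta),
                        np.zeros(n_rays)], axis=-1)

    # All rays pass through the common in-focus point.
    directions = focus_point[None, :] - origins
    directions /= np.linalg.norm(directions, axis=-1, keepdims=True)
    return origins, directions

def render_pixel(origins, directions, radiance_fn):
    # Average the radiance over the aperture rays. With aperture_radius -> 0,
    # all origins collapse to the pinhole and this reduces to a standard
    # NeRF ray, which is why viewpoint and defocus can share one framework.
    return np.mean([radiance_fn(o, d) for o, d in zip(origins, directions)],
                   axis=0)
```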
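"Aperture randomized training" is described only as generating images while randomizing the aperture size and the latent codes independently. A schematic adversarial training step consistent with that description might look like the following; the sampling distributions, the non-saturating GAN loss, and the names (`generator`, `discriminator`, `max_aperture`) are placeholders, not the paper's actual objective or API.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aperture_randomized_step(generator, discriminator, real_images,
                             latent_dim=128, max_aperture=1.0):
    """One adversarial step sketched from the abstract (hypothetical names).

    Drawing the latent code z and the aperture size a from independent
    distributions encourages the model to disentangle defocus-independent
    content (z) from the defocus effect itself (a).
    """
    batch = len(real_images)
    z = np.random.randn(batch, latent_dim)           # scene content code
    a = np.random.uniform(0.0, max_aperture, batch)  # aperture size, drawn
                                                     # independently of z
    fake_images = generator(z, a)                    # render with defocus

    # Standard non-saturating GAN losses; the paper's exact objective may
    # differ -- this only illustrates the independent randomization.
    d_real = discriminator(real_images)
    d_fake = discriminator(fake_images)
    d_loss = -np.mean(np.log(sigmoid(d_real)) + np.log(1.0 - sigmoid(d_fake)))
    g_loss = -np.mean(np.log(sigmoid(d_fake)))
    return d_loss, g_loss
```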