Gaussian Shell Maps for Efficient 3D Human Generation
Main authors: , , , , , , , ,
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: Efficient generation of 3D digital humans is important in several industries, including virtual reality, social media, and cinematic production. 3D generative adversarial networks (GANs) have demonstrated state-of-the-art (SOTA) quality and diversity for generated assets. Current 3D GAN architectures, however, typically rely on volume representations, which are slow to render, thereby hampering GAN training and requiring multi-view-inconsistent 2D upsamplers. Here, we introduce Gaussian Shell Maps (GSMs) as a framework that connects SOTA generator network architectures with emerging 3D Gaussian rendering primitives using an articulable, multi-shell-based scaffold. In this setting, a CNN generates a 3D texture stack with features that are mapped to the shells. The latter represent inflated and deflated versions of a template surface of a digital human in a canonical body pose. Instead of rasterizing the shells directly, we sample 3D Gaussians on the shells whose attributes are encoded in the texture features. These Gaussians are efficiently and differentiably rendered. The ability to articulate the shells is important during GAN training and, at inference time, to deform a body into arbitrary user-defined poses. Our efficient rendering scheme bypasses the need for view-inconsistent upsamplers and achieves high-quality multi-view consistent renderings at a native resolution of $512 \times 512$ pixels. We demonstrate that GSMs successfully generate 3D humans when trained on single-view datasets, including SHHQ and DeepFashion.
DOI: 10.48550/arxiv.2311.17857
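To make the pipeline described in the abstract concrete, below is a minimal Python/NumPy sketch of the core Gaussian Shell Map idea: offset a template surface along its vertex normals to form shells, sample Gaussian center positions on those shells via barycentric coordinates, and look up per-Gaussian attributes from a generated texture stack. This is not the authors' implementation; all function names, array shapes, the 11-channel attribute layout, and the nearest-neighbor texture lookup are illustrative assumptions.

```python
# Hedged sketch of the Gaussian Shell Map scaffold, NOT the paper's code.
import numpy as np

def build_shells(verts, normals, offsets):
    """Inflate/deflate a canonical template surface along vertex normals.

    verts:   (V, 3) template vertices in the canonical body pose
    normals: (V, 3) unit vertex normals
    offsets: (S,)   signed offsets, one per shell (0.0 = the template itself)
    returns  (S, V, 3) stacked shell vertices
    """
    return verts[None] + offsets[:, None, None] * normals[None]

def sample_gaussians_on_shells(shells, uvs, faces, n_per_shell, rng):
    """Sample Gaussian center positions and UVs uniformly on each shell.

    shells: (S, V, 3), uvs: (V, 2), faces: (F, 3) triangle indices.
    Samples fixed barycentric coordinates, so articulating the scaffold
    would move the Gaussians with the shells.
    """
    S = shells.shape[0]
    f = rng.integers(0, faces.shape[0], size=(S, n_per_shell))
    # Uniform barycentric sampling on a triangle.
    r1, r2 = rng.random((2, S, n_per_shell))
    w0 = 1.0 - np.sqrt(r1)
    w1 = np.sqrt(r1) * (1.0 - r2)
    w2 = np.sqrt(r1) * r2
    tri = faces[f]                                    # (S, N, 3)
    idx = np.arange(S)[:, None]
    pos = (w0[..., None] * shells[idx, tri[..., 0]]
           + w1[..., None] * shells[idx, tri[..., 1]]
           + w2[..., None] * shells[idx, tri[..., 2]])  # (S, N, 3)
    uv = (w0[..., None] * uvs[tri[..., 0]]
          + w1[..., None] * uvs[tri[..., 1]]
          + w2[..., None] * uvs[tri[..., 2]])           # (S, N, 2)
    return pos, uv

def lookup_attributes(texture_stack, uv):
    """Nearest-neighbor lookup of Gaussian attributes from the texture stack.

    texture_stack: (S, H, W, C) CNN output, one texture per shell; channels
    could encode e.g. color, opacity, scale, and rotation for the renderer
    (the exact encoding here is an assumption).
    """
    S, H, W, _ = texture_stack.shape
    x = np.clip((uv[..., 0] * (W - 1)).round().astype(int), 0, W - 1)
    y = np.clip((uv[..., 1] * (H - 1)).round().astype(int), 0, H - 1)
    return texture_stack[np.arange(S)[:, None], y, x]   # (S, N, C)

# Toy usage on a single quad "template" (two triangles).
rng = np.random.default_rng(0)
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], float)
normals = np.tile([0.0, 0.0, 1.0], (4, 1))
uvs = verts[:, :2].copy()
faces = np.array([[0, 1, 2], [0, 2, 3]])
shells = build_shells(verts, normals, np.array([-0.02, 0.0, 0.02]))
pos, uv = sample_gaussians_on_shells(shells, uvs, faces, 1000, rng)
attrs = lookup_attributes(rng.random((3, 64, 64, 11)), uv)
print(pos.shape, attrs.shape)   # (3, 1000, 3) (3, 1000, 11)
```

Because the Gaussians are anchored to fixed barycentric positions on the shells, deforming the shell scaffold into a new body pose carries the Gaussians, and their texture-encoded attributes, along with it, which is the articulation property the abstract highlights.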