Multi3D: 3D-aware multimodal image synthesis

Bibliographic details
Published in:	Computational Visual Media 2024-12, Vol.10 (6), p.1205-1217
Main authors:	Zhou, Wenyang; Yuan, Lu; Mu, Taijiang
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Computer Graphics; Computer Science; Controllability; Datasets; Editing; Image enhancement; Image Processing and Computer Vision; Image quality; Image segmentation; Methods; Radiation; Research Article; Robust control; Semantics; Sketches; Synthesis; User Interfaces and Human Computer Interaction
Online access:	Full text

Description
Abstract:	3D-aware image synthesis has attained high quality and robust 3D consistency. Existing 3D controllable generative models are designed to synthesize 3D-aware images through a single modality, such as 2D segmentation or sketches, but lack the ability to finely control generated content, such as texture and age. In pursuit of enhancing user-guided controllability, we propose Multi3D, a 3D-aware controllable image synthesis model that supports multi-modal input. Our model can govern the geometry of the generated image using a 2D label map, such as a segmentation or sketch map, while concurrently regulating the appearance of the generated image through a textual description. To demonstrate the effectiveness of our method, we have conducted experiments on multiple datasets, including CelebAMask-HQ, AFHQ-cat, and shapenet-car. Qualitative and quantitative evaluations show that our method outperforms existing state-of-the-art methods.
ISSN:	2096-0433 2096-0662
DOI:	10.1007/s41095-024-0422-4