UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
Format: | Article |
Language: | English |
Abstract: | Ultra-high-resolution image generation poses great challenges, such as
increased semantic planning complexity and detail synthesis difficulties,
alongside substantial training resource demands. We present UltraPixel, a novel
architecture utilizing cascade diffusion models to generate high-quality images
at multiple resolutions (e.g., 1K to 6K) within a single model, while
maintaining computational efficiency. UltraPixel leverages semantics-rich
representations of lower-resolution images in the later denoising stages to
guide the generation of highly detailed high-resolution images, significantly
reducing complexity. Furthermore, we introduce implicit neural representations
for continuous upsampling and scale-aware normalization layers adaptable to
various resolutions. Notably, both the low- and high-resolution processes are
performed in the most compact space, sharing the majority of parameters, with
less than 3% additional parameters for high-resolution outputs, which greatly
enhances training and inference efficiency. Our model trains quickly with
reduced data requirements, produces photo-realistic high-resolution images, and
demonstrates state-of-the-art performance in extensive experiments. |
DOI: | 10.48550/arxiv.2407.02158 |
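
The abstract mentions scale-aware normalization layers that let a single model adapt to different target resolutions. As a rough illustration only, the following is a minimal PyTorch sketch of one plausible form of such a layer; the class name `ScaleAwareNorm`, the embedding size, and the modulation head are assumptions for this sketch, not the paper's actual implementation. Features are normalized as usual, and the affine gain/bias are predicted from an embedding of the target resolution.

```python
import torch
import torch.nn as nn


class ScaleAwareNorm(nn.Module):
    """Illustrative sketch (not the paper's code): group normalization whose
    affine parameters are predicted from a resolution (scale) embedding, so one
    layer can serve feature maps aimed at different output resolutions."""

    def __init__(self, channels: int, scale_dim: int = 128, groups: int = 32):
        super().__init__()
        self.norm = nn.GroupNorm(groups, channels, affine=False)
        # Hypothetical modulation head: maps the scale embedding to per-channel gain/bias.
        self.to_gain_bias = nn.Linear(scale_dim, 2 * channels)
        # Start as an identity modulation so training begins from plain normalization.
        nn.init.zeros_(self.to_gain_bias.weight)
        nn.init.zeros_(self.to_gain_bias.bias)

    def forward(self, x: torch.Tensor, scale_emb: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; scale_emb: (B, scale_dim) encoding the target resolution.
        gain, bias = self.to_gain_bias(scale_emb).chunk(2, dim=1)
        x = self.norm(x)
        return x * (1 + gain[:, :, None, None]) + bias[:, :, None, None]


if __name__ == "__main__":
    # Usage sketch: the same layer processes features intended for different output sizes,
    # distinguished only by the scale embedding it receives.
    layer = ScaleAwareNorm(channels=64)
    feat = torch.randn(2, 64, 32, 32)
    emb_for_1k = torch.randn(2, 128)  # stand-in embedding for a 1K target resolution
    out = layer(feat, emb_for_1k)
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```

The design choice illustrated here is that resolution conditioning enters only through lightweight per-channel modulation, which is consistent with the abstract's claim that high-resolution outputs add less than 3% extra parameters while most weights are shared across resolutions.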