Efficient image generation with Contour Wavelet Diffusion
Published in: | Computers & graphics 2024-11, Vol.124, p.104087, Article 104087 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | The burgeoning field of image generation has captivated academia and industry with its potential to produce high-quality images, facilitating applications like text-to-image conversion, image translation, and recovery. These advancements have notably propelled the growth of the metaverse, where virtual environments constructed from generated images offer new interactive experiences, especially in conjunction with digital libraries. The technology creates detailed high-quality images, enabling immersive experiences. Despite diffusion models showing promise with superior image quality and mode coverage over GANs, their slow training and inference speeds have hindered broader adoption. To counter this, we introduce the Contour Wavelet Diffusion Model, which accelerates the process by decomposing features and employing multi-directional, anisotropic analysis. This model integrates an attention mechanism to focus on high-frequency details and a reconstruction loss function to ensure image consistency and accelerate convergence. The result is a significant reduction in training and inference times without sacrificing image quality, making diffusion models viable for large-scale applications and enhancing their practicality in the evolving digital landscape.
• The Contour Wavelet Diffusion Model speeds up the original diffusion model while ensuring the quality of generated images.
• The attention module enables effective focus on high-frequency information to improve the quality of image generation.
• The model incorporates a reconstruction loss function to ensure the network learns variation consistency. |
---|---|
ISSN: | 0097-8493 |
DOI: | 10.1016/j.cag.2024.104087 |
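
The abstract attributes the speed-up to decomposing features into frequency components. As a rough illustration of that general idea only, the sketch below applies a single-level 2D Haar wavelet transform in pure Python. Haar is an assumption swapped in here as the simplest stand-in: this record does not specify the paper's multi-directional, anisotropic decomposition, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical.

```python
# Illustrative sketch only: single-level 2D Haar transform in pure Python.
# Assumption: Haar stands in for the paper's multi-directional decomposition,
# whose actual filters are not given in this record.

def haar_dwt2(image):
    """Split an even-sized 2D grid into four half-resolution subbands."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            ll_row.append((a + b + d + e) / 2)  # local average (low frequency)
            lh_row.append((a - b + d - e) / 2)  # left-minus-right detail
            hl_row.append((a + b - d - e) / 2)  # top-minus-bottom detail
            hh_row.append((a - b - d + e) / 2)  # diagonal detail
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, rebuilding the full-resolution grid."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, x, y, z = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        image += [top, bottom]
    return image


# Toy 4x4 "image": each subband comes out 2x2, a quarter of the pixels.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Each subband holds one quarter of the original pixels, so a network that operates on subbands works on smaller spatial grids, and the transform is exactly invertible, so no image content is lost by moving into the wavelet domain. How the paper combines this with its attention module and reconstruction loss is not described in this record.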