Frequency-Time Diffusion with Neural Cellular Automata
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Despite considerable success, large Denoising Diffusion Models (DDMs) with a
UNet backbone pose practical challenges, particularly on limited hardware and
in processing gigapixel images. To address these limitations, we introduce two
Neural Cellular Automata (NCA)-based DDMs: Diff-NCA and FourierDiff-NCA.
Capitalizing on the local communication capabilities of NCA, Diff-NCA
significantly reduces the parameter counts of NCA-based DDMs. Integrating
Fourier-based diffusion enables global communication early in the diffusion
process. This feature is particularly valuable in synthesizing complex images
with important global features, such as the CelebA dataset. We demonstrate that
even a 331k-parameter Diff-NCA can generate 512×512 pathology slices, while
FourierDiff-NCA (1.1M parameters) reaches an FID score of 43.86, roughly three
times lower than the 128.2 scored by a UNet about 3.6 times larger (3.94M
parameters). Additionally, FourierDiff-NCA can perform diverse tasks such as
super-resolution, out-of-distribution image synthesis, and inpainting without
explicit training. |
---|---|
DOI: | 10.48550/arxiv.2401.06291 |
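
The summary above relies on the "local communication" of Neural Cellular Automata. As a rough illustration only, the following is a minimal NumPy sketch of one generic NCA update step, not the Diff-NCA or FourierDiff-NCA architecture from the article. The function names `perceive` and `nca_step`, the Sobel-based perception, the zero padding, and the channel and hidden sizes are all assumptions chosen for the example.

```python
import numpy as np


def perceive(state):
    """Collect each cell's 3x3 neighbourhood information.

    state: float array of shape (H, W, C).
    Returns shape (H, W, 3*C): the cell's own values plus x/y Sobel gradients.
    Zero padding at the border is an assumption made for this sketch.
    """
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float) / 8.0
    sobel_y = sobel_x.T
    h, w, _ = state.shape
    padded = np.pad(state, ((1, 1), (1, 1), (0, 0)), mode="constant")
    grad_x = np.zeros_like(state)
    grad_y = np.zeros_like(state)
    # Apply the 3x3 filters channel-wise by summing shifted copies.
    for dy in range(3):
        for dx in range(3):
            patch = padded[dy:dy + h, dx:dx + w, :]
            grad_x += sobel_x[dy, dx] * patch
            grad_y += sobel_y[dy, dx] * patch
    return np.concatenate([state, grad_x, grad_y], axis=-1)


def nca_step(state, w1, b1, w2):
    """One residual update: the same tiny MLP is applied to every cell."""
    features = perceive(state)                     # (H, W, 3*C)
    hidden = np.maximum(features @ w1 + b1, 0.0)   # ReLU, (H, W, hidden)
    return state + hidden @ w2                     # (H, W, C)


# Hypothetical sizes, chosen only to keep the example small.
channels, hidden_size = 4, 16
rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.1, size=(3 * channels, hidden_size))
b1 = np.zeros(hidden_size)
w2 = rng.normal(scale=0.1, size=(hidden_size, channels))

state = rng.normal(size=(16, 16, channels))
next_state = nca_step(state, w1, b1, w2)

# The weights are shared by all cells, so the parameter count does not
# depend on the image size.
num_params = w1.size + b1.size + w2.size
```

In this sketch a change to one cell can only influence its 3×3 neighbourhood after a single step, and the reach grows by one cell per further step. Information therefore needs many steps to cross a large image, which is the kind of limitation the summary says Fourier-based diffusion addresses by enabling global communication early in the diffusion process.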