Conditional Image Synthesis with Diffusion Models: A Survey

Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, th...

Bibliographic details
Main authors: Zhan, Zheyuan; Chen, Defang; Mei, Jian-Ping; Zhao, Zhenghe; Chen, Jiawei; Chen, Chun; Lyu, Siwei; Wang, Can
Format: Article
Language: eng
Online access: Order full text
creator Zhan, Zheyuan
Chen, Defang
Mei, Jian-Ping
Zhao, Zhenghe
Chen, Jiawei
Chen, Chun
Lyu, Siwei
Wang, Can
description Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.
doi_str_mv 10.48550/arxiv.2409.19365
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2409_19365</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2409_19365</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2409_193653</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DO0NDYz5WSwds7PS8ksyczPS8xR8MxNTE9VCK7MK8lILc4sVijPLMlQcMlMSystBipQ8M1PSc0ptlJwVAguLSpLreRhYE1LzClO5YXS3Azybq4hzh66YGviC4oycxOLKuNB1sWDrTMmrAIAuZU0tQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Conditional Image Synthesis with Diffusion Models: A Survey</title><source>arXiv.org</source><creator>Zhan, Zheyuan ; Chen, Defang ; Mei, Jian-Ping ; Zhao, Zhenghe ; Chen, Jiawei ; Chen, Chun ; Lyu, Siwei ; Wang, Can</creator><creatorcontrib>Zhan, Zheyuan ; Chen, Defang ; Mei, Jian-Ping ; Zhao, Zhenghe ; Chen, Jiawei ; Chen, Chun ; Lyu, Siwei ; Wang, Can</creatorcontrib><description>Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. 
We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.</description><identifier>DOI: 10.48550/arxiv.2409.19365</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2024-09</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2409.19365$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2409.19365$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhan, Zheyuan</creatorcontrib><creatorcontrib>Chen, Defang</creatorcontrib><creatorcontrib>Mei, Jian-Ping</creatorcontrib><creatorcontrib>Zhao, Zhenghe</creatorcontrib><creatorcontrib>Chen, Jiawei</creatorcontrib><creatorcontrib>Chen, Chun</creatorcontrib><creatorcontrib>Lyu, Siwei</creatorcontrib><creatorcontrib>Wang, Can</creatorcontrib><title>Conditional Image Synthesis with Diffusion Models: A Survey</title><description>Conditional image synthesis based on user-specified requirements is a key component in 
creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. 
Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DO0NDYz5WSwds7PS8ksyczPS8xR8MxNTE9VCK7MK8lILc4sVijPLMlQcMlMSystBipQ8M1PSc0ptlJwVAguLSpLreRhYE1LzClO5YXS3Azybq4hzh66YGviC4oycxOLKuNB1sWDrTMmrAIAuZU0tQ</recordid><startdate>20240928</startdate><enddate>20240928</enddate><creator>Zhan, Zheyuan</creator><creator>Chen, Defang</creator><creator>Mei, Jian-Ping</creator><creator>Zhao, Zhenghe</creator><creator>Chen, Jiawei</creator><creator>Chen, Chun</creator><creator>Lyu, Siwei</creator><creator>Wang, Can</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240928</creationdate><title>Conditional Image Synthesis with Diffusion Models: A Survey</title><author>Zhan, Zheyuan ; Chen, Defang ; Mei, Jian-Ping ; Zhao, Zhenghe ; Chen, Jiawei ; Chen, Chun ; Lyu, Siwei ; Wang, Can</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2409_193653</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Zhan, Zheyuan</creatorcontrib><creatorcontrib>Chen, Defang</creatorcontrib><creatorcontrib>Mei, Jian-Ping</creatorcontrib><creatorcontrib>Zhao, Zhenghe</creatorcontrib><creatorcontrib>Chen, Jiawei</creatorcontrib><creatorcontrib>Chen, Chun</creatorcontrib><creatorcontrib>Lyu, Siwei</creatorcontrib><creatorcontrib>Wang, Can</creatorcontrib><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhan, Zheyuan</au><au>Chen, Defang</au><au>Mei, Jian-Ping</au><au>Zhao, Zhenghe</au><au>Chen, Jiawei</au><au>Chen, Chun</au><au>Lyu, Siwei</au><au>Wang, Can</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Conditional Image Synthesis with Diffusion Models: A Survey</atitle><date>2024-09-28</date><risdate>2024</risdate><abstract>Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.</abstract><doi>10.48550/arxiv.2409.19365</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2409.19365
language eng
recordid cdi_arxiv_primary_2409_19365
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
title Conditional Image Synthesis with Diffusion Models: A Survey
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T23%3A26%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Conditional%20Image%20Synthesis%20with%20Diffusion%20Models:%20A%20Survey&rft.au=Zhan,%20Zheyuan&rft.date=2024-09-28&rft_id=info:doi/10.48550/arxiv.2409.19365&rft_dat=%3Carxiv_GOX%3E2409_19365%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true