Conditional Image Synthesis with Diffusion Models: A Survey

Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, th...

Bibliographic details
Main authors:	Zhan, Zheyuan; Chen, Defang; Mei, Jian-Ping; Zhao, Zhenghe; Chen, Jiawei; Chen, Chun; Lyu, Siwei; Wang, Can
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Zhan, Zheyuan; Chen, Defang; Mei, Jian-Ping; Zhao, Zhenghe; Chen, Jiawei; Chen, Chun; Lyu, Siwei; Wang, Can
description	Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.
doi_str_mv	10.48550/arxiv.2409.19365
format	Article
creationdate	2024-09-28
rights	http://creativecommons.org/licenses/by/4.0
oa	free_for_read
sourcetype	Open Access Repository
linktorsrc	https://arxiv.org/abs/2409.19365
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2409.19365
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2409_19365
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
title	Conditional Image Synthesis with Diffusion Models: A Survey
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T23%3A26%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Conditional%20Image%20Synthesis%20with%20Diffusion%20Models:%20A%20Survey&rft.au=Zhan,%20Zheyuan&rft.date=2024-09-28&rft_id=info:doi/10.48550/arxiv.2409.19365&rft_dat=%3Carxiv_GOX%3E2409_19365%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true