Diffusion Models in Low-Level Vision: A Survey

Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to...

Bibliographic details
Main authors: He, Chunming; Shen, Yuqi; Fang, Chengyu; Xiao, Fengyang; Tang, Longxiang; Zhang, Yulun; Zuo, Wangmeng; Guo, Zhenhua; Li, Xiu
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator He, Chunming
Shen, Yuqi
Fang, Chengyu
Xiao, Fengyang
Tang, Longxiang
Zhang, Yulun
Zuo, Wangmeng
Guo, Zhenhua
Li, Xiu
description Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. This ensures the generation of visually compelling results with intricate texture information. Despite their remarkable success, a noticeable gap exists in a comprehensive survey that amalgamates these pioneering diffusion model-based works and organizes the corresponding threads. This paper presents a comprehensive review of diffusion model-based techniques. We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models, establishing the theoretical foundation. Following this, we introduce a multi-perspective categorization of diffusion models, considering both the underlying framework and the target task. Additionally, we summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios. Moreover, we provide an overview of commonly used benchmarks and evaluation metrics. We conduct a thorough evaluation, encompassing both performance and efficiency, of diffusion model-based techniques in three prominent tasks. Finally, we elucidate the limitations of current diffusion models and propose seven intriguing directions for future research. This comprehensive examination aims to facilitate a profound understanding of the landscape surrounding denoising diffusion models in the context of low-level vision tasks. A curated list of diffusion model-based techniques in over 20 low-level vision tasks can be found at https://github.com/ChunmingHe/awesome-diffusion-models-in-low-level-vision.
doi_str_mv 10.48550/arxiv.2406.11138
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2406_11138</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2406_11138</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-862867d52be2ae42c2f093611bbe5cd27cc208a34885bb5abae4e4828ef43a043</originalsourceid><addsrcrecordid>eNotzstuwjAUBFBvukCUD2CFfyDBvn7k0h2C8pCCuihiG9nJtWQpkCoRKfw9z9UsZjQ6jI2lSDUaI6auvcQ-BS1sKqVUOGDpMoZw7mJz4rumorrj8cTz5j_JqaeaH-Kj-uJz_ntue7p-so_g6o5G7xyy_ep7v9gk-c96u5jnibMZJmgBbVYZ8ASONJQQxExZKb0nU1aQlSUIdEojGu-N8_cRaQSkoJUTWg3Z5HX7BBd_bTy69lo84MUTrm4MszxM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Diffusion Models in Low-Level Vision: A Survey</title><source>arXiv.org</source><creator>He, Chunming ; Shen, Yuqi ; Fang, Chengyu ; Xiao, Fengyang ; Tang, Longxiang ; Zhang, Yulun ; Zuo, Wangmeng ; Guo, Zhenhua ; Li, Xiu</creator><creatorcontrib>He, Chunming ; Shen, Yuqi ; Fang, Chengyu ; Xiao, Fengyang ; Tang, Longxiang ; Zhang, Yulun ; Zuo, Wangmeng ; Guo, Zhenhua ; Li, Xiu</creatorcontrib><description>Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. This ensures the generation of visually compelling results with intricate texture information. Despite their remarkable success, a noticeable gap exists in a comprehensive survey that amalgamates these pioneering diffusion model-based works and organizes the corresponding threads. This paper proposes the comprehensive review of diffusion model-based techniques. 
We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models, establishing the theoretical foundation. Following this, we introduce a multi-perspective categorization of diffusion models, considering both the underlying framework and the target task. Additionally, we summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios. Moreover, we provide an overview of commonly used benchmarks and evaluation metrics. We conduct a thorough evaluation, encompassing both performance and efficiency, of diffusion model-based techniques in three prominent tasks. Finally, we elucidate the limitations of current diffusion models and propose seven intriguing directions for future research. This comprehensive examination aims to facilitate a profound understanding of the landscape surrounding denoising diffusion models in the context of low-level vision tasks. A curated list of diffusion model-based techniques in over 20 low-level vision tasks can be found at https://github.com/ChunmingHe/awesome-diffusion-models-in-low-level-vision.</description><identifier>DOI: 10.48550/arxiv.2406.11138</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2024-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2406.11138$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2406.11138$$DView paper in 
arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>He, Chunming</creatorcontrib><creatorcontrib>Shen, Yuqi</creatorcontrib><creatorcontrib>Fang, Chengyu</creatorcontrib><creatorcontrib>Xiao, Fengyang</creatorcontrib><creatorcontrib>Tang, Longxiang</creatorcontrib><creatorcontrib>Zhang, Yulun</creatorcontrib><creatorcontrib>Zuo, Wangmeng</creatorcontrib><creatorcontrib>Guo, Zhenhua</creatorcontrib><creatorcontrib>Li, Xiu</creatorcontrib><title>Diffusion Models in Low-Level Vision: A Survey</title><description>Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. This ensures the generation of visually compelling results with intricate texture information. Despite their remarkable success, a noticeable gap exists in a comprehensive survey that amalgamates these pioneering diffusion model-based works and organizes the corresponding threads. This paper proposes the comprehensive review of diffusion model-based techniques. We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models, establishing the theoretical foundation. Following this, we introduce a multi-perspective categorization of diffusion models, considering both the underlying framework and the target task. Additionally, we summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios. Moreover, we provide an overview of commonly used benchmarks and evaluation metrics. We conduct a thorough evaluation, encompassing both performance and efficiency, of diffusion model-based techniques in three prominent tasks. 
Finally, we elucidate the limitations of current diffusion models and propose seven intriguing directions for future research. This comprehensive examination aims to facilitate a profound understanding of the landscape surrounding denoising diffusion models in the context of low-level vision tasks. A curated list of diffusion model-based techniques in over 20 low-level vision tasks can be found at https://github.com/ChunmingHe/awesome-diffusion-models-in-low-level-vision.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzstuwjAUBFBvukCUD2CFfyDBvn7k0h2C8pCCuihiG9nJtWQpkCoRKfw9z9UsZjQ6jI2lSDUaI6auvcQ-BS1sKqVUOGDpMoZw7mJz4rumorrj8cTz5j_JqaeaH-Kj-uJz_ntue7p-so_g6o5G7xyy_ep7v9gk-c96u5jnibMZJmgBbVYZ8ASONJQQxExZKb0nU1aQlSUIdEojGu-N8_cRaQSkoJUTWg3Z5HX7BBd_bTy69lo84MUTrm4MszxM</recordid><startdate>20240616</startdate><enddate>20240616</enddate><creator>He, Chunming</creator><creator>Shen, Yuqi</creator><creator>Fang, Chengyu</creator><creator>Xiao, Fengyang</creator><creator>Tang, Longxiang</creator><creator>Zhang, Yulun</creator><creator>Zuo, Wangmeng</creator><creator>Guo, Zhenhua</creator><creator>Li, Xiu</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240616</creationdate><title>Diffusion Models in Low-Level Vision: A Survey</title><author>He, Chunming ; Shen, Yuqi ; Fang, Chengyu ; Xiao, Fengyang ; Tang, Longxiang ; Zhang, Yulun ; Zuo, Wangmeng ; Guo, Zhenhua ; Li, Xiu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-862867d52be2ae42c2f093611bbe5cd27cc208a34885bb5abae4e4828ef43a043</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial 
Intelligence</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>He, Chunming</creatorcontrib><creatorcontrib>Shen, Yuqi</creatorcontrib><creatorcontrib>Fang, Chengyu</creatorcontrib><creatorcontrib>Xiao, Fengyang</creatorcontrib><creatorcontrib>Tang, Longxiang</creatorcontrib><creatorcontrib>Zhang, Yulun</creatorcontrib><creatorcontrib>Zuo, Wangmeng</creatorcontrib><creatorcontrib>Guo, Zhenhua</creatorcontrib><creatorcontrib>Li, Xiu</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>He, Chunming</au><au>Shen, Yuqi</au><au>Fang, Chengyu</au><au>Xiao, Fengyang</au><au>Tang, Longxiang</au><au>Zhang, Yulun</au><au>Zuo, Wangmeng</au><au>Guo, Zhenhua</au><au>Li, Xiu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Diffusion Models in Low-Level Vision: A Survey</atitle><date>2024-06-16</date><risdate>2024</risdate><abstract>Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. This ensures the generation of visually compelling results with intricate texture information. Despite their remarkable success, a noticeable gap exists in a comprehensive survey that amalgamates these pioneering diffusion model-based works and organizes the corresponding threads. This paper proposes the comprehensive review of diffusion model-based techniques. We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models, establishing the theoretical foundation. 
Following this, we introduce a multi-perspective categorization of diffusion models, considering both the underlying framework and the target task. Additionally, we summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios. Moreover, we provide an overview of commonly used benchmarks and evaluation metrics. We conduct a thorough evaluation, encompassing both performance and efficiency, of diffusion model-based techniques in three prominent tasks. Finally, we elucidate the limitations of current diffusion models and propose seven intriguing directions for future research. This comprehensive examination aims to facilitate a profound understanding of the landscape surrounding denoising diffusion models in the context of low-level vision tasks. A curated list of diffusion model-based techniques in over 20 low-level vision tasks can be found at https://github.com/ChunmingHe/awesome-diffusion-models-in-low-level-vision.</abstract><doi>10.48550/arxiv.2406.11138</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2406.11138
ispartof
issn
language eng
recordid cdi_arxiv_primary_2406_11138
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
title Diffusion Models in Low-Level Vision: A Survey
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T19%3A31%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Diffusion%20Models%20in%20Low-Level%20Vision:%20A%20Survey&rft.au=He,%20Chunming&rft.date=2024-06-16&rft_id=info:doi/10.48550/arxiv.2406.11138&rft_dat=%3Carxiv_GOX%3E2406_11138%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true