DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation


Detailed description

Bibliographic details
Published in: arXiv.org, 2024-10
Main authors: Ai, Yuang; Zhou, Xiaoqiang; Huang, Huaibo; Han, Xiaotian; Chen, Zhengyu; You, Quanzeng; Yang, Hongxia
Format: Article
Language: English
Subjects: Datasets; Effectiveness; Image quality; Image restoration; Large language models; Performance degradation; Privacy
Online access: Full text
container_title arXiv.org
creator Ai, Yuang
Zhou, Xiaoqiang
Huang, Huaibo
Han, Xiaotian
Chen, Zhengyu
You, Quanzeng
Yang, Hongxia
description Image restoration (IR) in real-world scenarios presents significant challenges due to the lack of high-capacity models and comprehensive datasets. To tackle these issues, we present a dual strategy: GenIR, an innovative data curation pipeline, and DreamClear, a cutting-edge Diffusion Transformer (DiT)-based image restoration model. GenIR, our pioneering contribution, is a dual-prompt learning pipeline that overcomes the limitations of existing datasets, which typically comprise only a few thousand images and thus offer limited generalizability for larger models. GenIR streamlines the process into three stages: image-text pair construction, dual-prompt based fine-tuning, and data generation & filtering. This approach circumvents the laborious data crawling process, ensuring copyright compliance and providing a cost-effective, privacy-safe solution for IR dataset construction. The result is a large-scale dataset of one million high-quality images. Our second contribution, DreamClear, is a DiT-based image restoration model. It utilizes the generative priors of text-to-image (T2I) diffusion models and the robust perceptual capabilities of multi-modal large language models (MLLMs) to achieve photorealistic restoration. To boost the model's adaptability to diverse real-world degradations, we introduce the Mixture of Adaptive Modulator (MoAM). It employs token-wise degradation priors to dynamically integrate various restoration experts, thereby expanding the range of degradations the model can address. Our exhaustive experiments confirm DreamClear's superior performance, underlining the efficacy of our dual strategy for real-world image restoration. Code and pre-trained models are available at: https://github.com/shallowdream204/DreamClear.
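The Mixture of Adaptive Modulator (MoAM) described above routes each token to a soft combination of restoration experts based on a token-wise degradation prior. The following is a minimal illustrative sketch of that routing idea only, not the authors' implementation: the experts are stand-in linear maps, and all shapes, the prior dimension, and the names `moam_sketch` and `softmax` are assumptions made here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moam_sketch(tokens, degradation_prior, expert_weights, router_weights):
    """Token-wise mixture: each token's degradation prior selects a
    soft combination of restoration 'experts' (here: linear maps)."""
    # gates: (T, E) soft expert assignment per token, from its prior
    gates = softmax(degradation_prior @ router_weights)
    # expert outputs: (E, T, D) — every expert processes every token
    expert_out = np.einsum('edk,td->etk', expert_weights, tokens)
    # convex, per-token combination of the expert outputs: (T, D)
    return np.einsum('te,etk->tk', gates, expert_out)

T, D, E = 4, 8, 3                      # tokens, feature dim, experts
tokens = rng.normal(size=(T, D))
prior = rng.normal(size=(T, 5))        # token-wise degradation prior
router = rng.normal(size=(5, E))       # maps prior -> expert logits
experts = rng.normal(size=(E, D, D))   # one linear map per expert
out = moam_sketch(tokens, prior, experts, router)
print(out.shape)  # (4, 8)
```

Because the gate weights for each token sum to one, tokens with different degradation priors can lean on different experts while the output stays in the same feature space as the input.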
format Article
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_3120692430
source Free E- Journals
subjects Datasets
Effectiveness
Image quality
Image restoration
Large language models
Performance degradation
Privacy
title DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation