RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration

All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded imag...

Detailed description

Bibliographic details
Published in: Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258
Main authors: Tang, Aiqiang; Wu, Yan; Zhang, Yuwei
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 4
container_start_page 258
container_title Applied intelligence (Dordrecht, Netherlands)
container_volume 55
creator Tang, Aiqiang
Wu, Yan
Zhang, Yuwei
description All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.
doi_str_mv 10.1007/s10489-024-06226-y
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3151294765</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3151294765</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</originalsourceid><addsrcrecordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3151294765</pqid></control><display><type>article</type><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><source>SpringerLink Journals (MCLS)</source><creator>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creator><creatorcontrib>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creatorcontrib><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. 
Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-06226-y</identifier><language>eng</language><publisher>Boston: Springer Nature B.V</publisher><subject>Computer vision ; Efficiency ; Feature maps ; Image acquisition ; Image degradation ; Image quality ; Image reconstruction ; Image restoration ; Reasoning</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258</ispartof><rights>Copyright Springer Nature B.V. 
Feb 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</cites><orcidid>0000-0002-8874-8886</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><title>Applied intelligence (Dordrecht, Netherlands)</title><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. 
These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><subject>Computer vision</subject><subject>Efficiency</subject><subject>Feature maps</subject><subject>Image acquisition</subject><subject>Image degradation</subject><subject>Image quality</subject><subject>Image reconstruction</subject><subject>Image restoration</subject><subject>Reasoning</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</recordid><startdate>202502</startdate><enddate>202502</enddate><creator>Tang, Aiqiang</creator><creator>Wu, Yan</creator><creator>Zhang, Yuwei</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></search><sort><creationdate>202502</creationdate><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><author>Tang, Aiqiang ; Wu, Yan ; Zhang, 
Yuwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Computer vision</topic><topic>Efficiency</topic><topic>Feature maps</topic><topic>Image acquisition</topic><topic>Image degradation</topic><topic>Image quality</topic><topic>Image reconstruction</topic><topic>Image restoration</topic><topic>Reasoning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tang, Aiqiang</au><au>Wu, Yan</au><au>Zhang, Yuwei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><date>2025-02</date><risdate>2025</risdate><volume>55</volume><issue>4</issue><spage>258</spage><pages>258-</pages><artnum>258</artnum><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>All-in-one image restoration aims to recover various degraded images using a unified model. 
To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</abstract><cop>Boston</cop><pub>Springer Nature B.V</pub><doi>10.1007/s10489-024-06226-y</doi><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0924-669X
ispartof Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258
issn 0924-669X
1573-7497
language eng
recordid cdi_proquest_journals_3151294765
source SpringerLink Journals (MCLS)
subjects Computer vision
Efficiency
Feature maps
Image acquisition
Image degradation
Image quality
Image reconstruction
Image restoration
Reasoning
title RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T19%3A24%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RamIR:%20Reasoning%20and%20action%20prompting%20with%20Mamba%20for%20all-in-one%20image%20restoration&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Tang,%20Aiqiang&rft.date=2025-02&rft.volume=55&rft.issue=4&rft.spage=258&rft.pages=258-&rft.artnum=258&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-06226-y&rft_dat=%3Cproquest_cross%3E3151294765%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3151294765&rft_id=info:pmid/&rfr_iscdi=true
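
The description field above states that the Mamba block models long-range dependencies with computational complexity linear in the feature map size. A minimal sketch of why a state-space recurrence scales linearly is given below. It is not the RamIR implementation: it uses a plain scalar recurrence with fixed parameters, whereas Mamba makes its parameters input-dependent, and the names `flatten_feature_map`, `ssm_scan`, `a`, `b`, and `c` are hypothetical, introduced only for illustration.

```python
from typing import List, Sequence


def flatten_feature_map(fmap: Sequence[Sequence[float]]) -> List[float]:
    """Flatten an H x W feature map into a length H*W sequence (row-major)."""
    return [value for row in fmap for value in row]


def ssm_scan(x: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run the recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    Assumption: scalar state and fixed (a, b, c). Each element is visited
    once, so the cost grows linearly with the sequence length, while the
    state h still carries information from all earlier positions.
    """
    h = 0.0
    outputs = []
    for x_t in x:
        h = a * h + b * x_t  # state update mixes the past state with the input
        outputs.append(c * h)  # readout from the current state
    return outputs


if __name__ == "__main__":
    feature_map = [[1.0, 0.0], [0.0, 0.0]]
    tokens = flatten_feature_map(feature_map)
    print(ssm_scan(tokens, a=0.5, b=1.0, c=2.0))
```

In this sketch the first input still influences the last output through the decaying state, which is the long-range behaviour the abstract refers to. Pairwise self-attention over the same sequence would compare every pair of positions and so scale quadratically.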