RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration

All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded imag...

Detailed description

Bibliographic details
Published in:	Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258
Main authors:	Tang, Aiqiang; Wu, Yan; Zhang, Yuwei
Format:	Article
Language:	eng
Subjects:	Computer vision; Efficiency; Feature maps; Image acquisition; Image degradation; Image quality; Image reconstruction; Image restoration; Reasoning
Online access:	Full text

container_end_page
container_issue	4
container_start_page	258
container_title	Applied intelligence (Dordrecht, Netherlands)
container_volume	55
creator	Tang, Aiqiang; Wu, Yan; Zhang, Yuwei
description	All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN- and Transformer-based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependency modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as an action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.
doi_str_mv	10.1007/s10489-024-06226-y
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3151294765</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3151294765</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</originalsourceid><addsrcrecordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3151294765</pqid></control><display><type>article</type><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><source>SpringerLink Journals (MCLS)</source><creator>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creator><creatorcontrib>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creatorcontrib><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. 
Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-06226-y</identifier><language>eng</language><publisher>Boston: Springer Nature B.V</publisher><subject>Computer vision ; Efficiency ; Feature maps ; Image acquisition ; Image degradation ; Image quality ; Image reconstruction ; Image restoration ; Reasoning</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258</ispartof><rights>Copyright Springer Nature B.V. 
Feb 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</cites><orcidid>0000-0002-8874-8886</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><title>Applied intelligence (Dordrecht, Netherlands)</title><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. 
These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><subject>Computer vision</subject><subject>Efficiency</subject><subject>Feature maps</subject><subject>Image acquisition</subject><subject>Image degradation</subject><subject>Image quality</subject><subject>Image reconstruction</subject><subject>Image restoration</subject><subject>Reasoning</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</recordid><startdate>202502</startdate><enddate>202502</enddate><creator>Tang, Aiqiang</creator><creator>Wu, Yan</creator><creator>Zhang, Yuwei</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></search><sort><creationdate>202502</creationdate><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><author>Tang, Aiqiang ; Wu, Yan ; Zhang, 
Yuwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Computer vision</topic><topic>Efficiency</topic><topic>Feature maps</topic><topic>Image acquisition</topic><topic>Image degradation</topic><topic>Image quality</topic><topic>Image reconstruction</topic><topic>Image restoration</topic><topic>Reasoning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tang, Aiqiang</au><au>Wu, Yan</au><au>Zhang, Yuwei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><date>2025-02</date><risdate>2025</risdate><volume>55</volume><issue>4</issue><spage>258</spage><pages>258-</pages><artnum>258</artnum><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>All-in-one image restoration aims to recover various degraded images using a unified model. 
To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</abstract><cop>Boston</cop><pub>Springer Nature B.V</pub><doi>10.1007/s10489-024-06226-y</doi><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0924-669X
ispartof	Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258
issn	0924-669X 1573-7497
language	eng
recordid	cdi_proquest_journals_3151294765
source	SpringerLink Journals (MCLS)
subjects	Computer vision; Efficiency; Feature maps; Image acquisition; Image degradation; Image quality; Image reconstruction; Image restoration; Reasoning
title	RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T19%3A24%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RamIR:%20Reasoning%20and%20action%20prompting%20with%20Mamba%20for%20all-in-one%20image%20restoration&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Tang,%20Aiqiang&rft.date=2025-02&rft.volume=55&rft.issue=4&rft.spage=258&rft.pages=258-&rft.artnum=258&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-06226-y&rft_dat=%3Cproquest_cross%3E3151294765%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3151294765&rft_id=info:pmid/&rfr_iscdi=true
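
Note on the abstract's complexity claim: the description states that the Mamba block models long-range dependencies with computational cost linear in the feature map size. The sketch below is only an illustration of why a state-space style recurrence scales linearly; it is not the RamIR or Mamba implementation. It assumes a fixed scalar recurrence h_t = a * h_{t-1} + b * x_t over a flattened feature map, whereas Mamba uses input-dependent (selective) parameters. The names `flatten_feature_map` and `linear_scan` and the coefficients `a` and `b` are hypothetical.

```python
from typing import List


def flatten_feature_map(feature_map: List[List[float]]) -> List[float]:
    """Flatten an H x W feature map into a length H*W sequence (row-major)."""
    return [value for row in feature_map for value in row]


def linear_scan(xs: List[float], a: float = 0.9, b: float = 0.1) -> List[float]:
    """Run the recurrence h_t = a * h_{t-1} + b * x_t in a single pass.

    Each element is visited once, so the cost grows linearly with len(xs).
    The fixed coefficients a and b are illustrative assumptions; they are
    not the learned, input-dependent parameters of a Mamba block.
    """
    h = 0.0
    states = []
    for x in xs:
        h = a * h + b * x  # the state carries information from all earlier positions
        states.append(h)
    return states


if __name__ == "__main__":
    sequence = flatten_feature_map([[1.0, 0.0], [0.0, 0.0]])
    print(linear_scan(sequence, a=0.5, b=1.0))
```

Because the state `h` summarizes every earlier position, the first pixel still influences the last output even though no pairwise comparison between positions is computed, which is the contrast with quadratic-cost self-attention.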