RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration
All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded imag...
Published in: | Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258 |
---|---|
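Main authors: | Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei |
Format: | Article |
Language: | eng |
Subjects: | Computer vision ; Efficiency ; Feature maps ; Image acquisition ; Image degradation ; Image quality ; Image reconstruction ; Image restoration ; Reasoning |
Online access: | Full text |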
container_end_page | |
---|---|
container_issue | 4 |
container_start_page | 258 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 55 |
creator | Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei |
description | All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration. |
doi_str_mv | 10.1007/s10489-024-06226-y |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3151294765</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3151294765</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</originalsourceid><addsrcrecordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3151294765</pqid></control><display><type>article</type><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><source>SpringerLink Journals (MCLS)</source><creator>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creator><creatorcontrib>Tang, Aiqiang ; Wu, Yan ; Zhang, Yuwei</creatorcontrib><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. 
Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-06226-y</identifier><language>eng</language><publisher>Boston: Springer Nature B.V</publisher><subject>Computer vision ; Efficiency ; Feature maps ; Image acquisition ; Image degradation ; Image quality ; Image reconstruction ; Image restoration ; Reasoning</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258</ispartof><rights>Copyright Springer Nature B.V. 
Feb 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</cites><orcidid>0000-0002-8874-8886</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><title>Applied intelligence (Dordrecht, Netherlands)</title><description>All-in-one image restoration aims to recover various degraded images using a unified model. To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. 
These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</description><subject>Computer vision</subject><subject>Efficiency</subject><subject>Feature maps</subject><subject>Image acquisition</subject><subject>Image degradation</subject><subject>Image quality</subject><subject>Image reconstruction</subject><subject>Image restoration</subject><subject>Reasoning</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkE9LAzEQxYMoWKtfwFPAc3SS7Gaz3qT4DypiUfAWJrvZuqWb1GSL9NubWk8DM2_evPkRcsnhmgNUN4lDoWsGomCghFBsd0QmvKwkq4q6OiYTqPNIqfrzlJyltAIAKYFPyNsCh-fFLV04TMH3fknRtxSbsQ-ebmIYNuO--dOPX_QFB4u0C5Hies16z4J3tB9w6Wh0aQwR91vn5KTDdXIX_3VKPh7u32dPbP76-Dy7m7OGl2pkWou6aJHbuums4GhbBZJraZXOnwgpGivAaddqVwF2pQUsheTO2rbDWgg5JVcH35zye5vvm1XYRp9PGslLnt0rVWaVOKiaGFKKrjObmCPHneFg9ujMAZ3J6MwfOrOTvzmRYc0</recordid><startdate>202502</startdate><enddate>202502</enddate><creator>Tang, Aiqiang</creator><creator>Wu, Yan</creator><creator>Zhang, Yuwei</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></search><sort><creationdate>202502</creationdate><title>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</title><author>Tang, Aiqiang ; Wu, Yan ; Zhang, 
Yuwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-88294da1b9cfb21abd603183b68489232cb20e8ed8e70af5b0a5231ebbdfa9223</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Computer vision</topic><topic>Efficiency</topic><topic>Feature maps</topic><topic>Image acquisition</topic><topic>Image degradation</topic><topic>Image quality</topic><topic>Image reconstruction</topic><topic>Image restoration</topic><topic>Reasoning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tang, Aiqiang</creatorcontrib><creatorcontrib>Wu, Yan</creatorcontrib><creatorcontrib>Zhang, Yuwei</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tang, Aiqiang</au><au>Wu, Yan</au><au>Zhang, Yuwei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><date>2025-02</date><risdate>2025</risdate><volume>55</volume><issue>4</issue><spage>258</spage><pages>258-</pages><artnum>258</artnum><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>All-in-one image restoration aims to recover various degraded images using a unified model. 
To adaptively reconstruct high-quality images, recent prevalent CNN and Transformer based models incorporate learnable prompts to dynamically acquire degradation-specific knowledge for different degraded images, achieving state-of-the-art restoration performance. However, existing methods exhibit limitations, including high computational burden and inadequate modeling of long-range dependencies. To address these issues, we propose a reasoning and action prompt-driven Mamba-based image restoration model, namely RamIR. Specifically, RamIR employs the Mamba block for long-range dependencies modeling with linear computational complexity relative to the feature map size. Inspired by Chain-of-Thought (CoT) prompting, we integrate Reasoning and Action (ReAct) prompts within the Mamba block. Hence, we utilize the capability of pretrained vision language (PVL) models to generate textual reasoning prompts describing the type and severity of degradations. Simultaneously, another output from PVL acts as action prompt representing the clean image caption. These prompts, employed in a CoT manner, enhance the network’s sensitivity to degradation and elicit targeted recovery actions tailored to different reasoning prompts. Additionally, we explore the seamless interaction between Mamba blocks and prompts, introducing a novel prompt-driven module (PDM) to facilitate prompt utilization. Extensive experimental results demonstrate the superior performance of RamIR, highlighting its advantages in terms of input scaling efficiency over existing benchmark models for all-in-one image restoration.</abstract><cop>Boston</cop><pub>Springer Nature B.V</pub><doi>10.1007/s10489-024-06226-y</doi><orcidid>https://orcid.org/0000-0002-8874-8886</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2025-02, Vol.55 (4), p.258, Article 258 |
issn | 0924-669X ; 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_3151294765 |
source | SpringerLink Journals (MCLS) |
subjects | Computer vision ; Efficiency ; Feature maps ; Image acquisition ; Image degradation ; Image quality ; Image reconstruction ; Image restoration ; Reasoning |
title | RamIR: Reasoning and action prompting with Mamba for all-in-one image restoration |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T19%3A24%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RamIR:%20Reasoning%20and%20action%20prompting%20with%20Mamba%20for%20all-in-one%20image%20restoration&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Tang,%20Aiqiang&rft.date=2025-02&rft.volume=55&rft.issue=4&rft.spage=258&rft.pages=258-&rft.artnum=258&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-06226-y&rft_dat=%3Cproquest_cross%3E3151294765%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3151294765&rft_id=info:pmid/&rfr_iscdi=true |