CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion

Infrared and visible image fusion aims to provide an informative image by combining complementary information from different sensors. Existing learning-based fusion approaches attempt to construct various loss functions to preserve complementary features, while neglecting to discover the inter-relationship between the two modalities, leading to redundant or even invalid information in the fusion results. Moreover, most methods focus on strengthening the network by increasing its depth while neglecting the importance of feature transmission, causing degradation of vital information. To alleviate these issues, we propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion in an end-to-end manner. Concretely, to simultaneously retain typical features from both modalities and to avoid artifacts emerging in the fused result, we develop a coupled contrastive constraint in our loss function. In a fused image, its foreground target/background detail part is pulled close to the infrared/visible source and pushed far away from the visible/infrared source in the representation space. We further exploit image characteristics to provide data-sensitive weights, allowing our loss function to build a more reliable relationship with the source images. A multi-level attention module is established to learn rich hierarchical feature representations and to comprehensively transfer features in the fusion process. We also apply the proposed CoCoNet to medical image fusion of different types, e.g., magnetic resonance, positron emission tomography, and single photon emission computed tomography images. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation, especially in preserving prominent targets and recovering vital textural details.
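
The coupled contrastive constraint described in the abstract can be illustrated with a short sketch: features of the fused image's foreground region act as anchors pulled toward the infrared source and pushed away from the visible source, with the roles reversed for the background. This is a minimal sketch assuming a generic feature encoder (e.g., a pretrained CNN) and a precomputed binary foreground mask; the function names and the distance-ratio form of the loss are illustrative assumptions, not the authors' implementation.

```python
import torch.nn.functional as F

def contrastive_term(anchor, positive, negative, eps=1e-8):
    # Ratio of anchor-positive to anchor-negative distance in feature space.
    # Minimizing it pulls the anchor toward the positive and away from the negative.
    d_pos = F.l1_loss(anchor, positive)
    d_neg = F.l1_loss(anchor, negative)
    return d_pos / (d_neg + eps)

def coupled_contrastive_loss(encoder, fused, ir, vis, fg_mask):
    # fg_mask marks the foreground target region (1) vs. background detail (0).
    # Foreground of the fused image is contrasted toward IR and away from VIS;
    # background is contrasted toward VIS and away from IR, in the encoder's feature space.
    bg_mask = 1.0 - fg_mask
    f_fused_fg, f_ir_fg, f_vis_fg = (encoder(x * fg_mask) for x in (fused, ir, vis))
    f_fused_bg, f_ir_bg, f_vis_bg = (encoder(x * bg_mask) for x in (fused, ir, vis))
    loss_fg = contrastive_term(f_fused_fg, positive=f_ir_fg, negative=f_vis_fg)
    loss_bg = contrastive_term(f_fused_bg, positive=f_vis_bg, negative=f_ir_bg)
    return loss_fg + loss_bg
```

A saliency-based mask and a VGG-style feature extractor would be plausible choices for fg_mask and encoder, and the data-sensitive weights mentioned in the abstract could scale the two terms per image.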

Bibliographic Details
Published in: International Journal of Computer Vision, 2024-05, Vol. 132 (5), pp. 1748-1775
Authors: Liu, Jinyuan; Lin, Runjia; Wu, Guanyao; Liu, Risheng; Luo, Zhongxuan; Fan, Xin
Format: Article
Language: English
DOI: 10.1007/s11263-023-01952-1
ISSN: 0920-5691
EISSN: 1573-1405
Publisher: Springer US (New York)
Source: Springer Nature - Complete Springer Journals
Online access: Full text
Subjects:
Artificial Intelligence
Computed tomography
Computer Imaging
Computer Science
Computer vision
Degeneration
Image Processing and Computer Vision
Infrared imagery
Learning
Magnetic resonance imaging
Medical imaging
Pattern Recognition
Pattern Recognition and Graphics
Photon emission
Positron emission
Representations
Tomography
Vision