CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion
Infrared and visible image fusion aims to provide an informative image by combining complementary information from different sensors. Existing learning-based fusion approaches attempt to construct various loss functions to preserve complementary features, while neglecting the inter-relationship between the two modalities, leading to redundant or even invalid information in the fusion results. Moreover, most methods focus on strengthening the network by increasing its depth while neglecting the importance of feature transmission, causing vital information to degrade. To alleviate these issues, we propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion in an end-to-end manner. Concretely, to simultaneously retain typical features from both modalities and to avoid artifacts emerging in the fused result, we develop a coupled contrastive constraint in our loss function. In a fused image, its foreground target/background detail part is pulled close to the infrared/visible source and pushed far away from the visible/infrared source in the representation space. We further exploit image characteristics to provide data-sensitive weights, allowing our loss function to build a more reliable relationship with the source images. A multi-level attention module is established to learn rich hierarchical feature representations and to comprehensively transfer features in the fusion process. We also apply the proposed CoCoNet to medical image fusion of different types, e.g., magnetic resonance, positron emission tomography, and single photon emission computed tomography images. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation, especially in preserving prominent targets and recovering vital textural details.
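The coupled contrastive constraint described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal PyTorch-style example under assumed ingredients (a frozen feature encoder `encode`, a binary foreground mask `mask`, and L1 distances in feature space) that pulls the foreground of the fused image toward the infrared source and the background toward the visible source while pushing each away from the opposite modality.

```python
import torch
import torch.nn.functional as F

def contrastive_term(anchor, positive, negative, eps=1e-6):
    # Pull the anchor toward the positive sample and push it away from the
    # negative one by minimising the ratio of the two feature distances.
    d_pos = F.l1_loss(anchor, positive)
    d_neg = F.l1_loss(anchor, negative)
    return d_pos / (d_neg + eps)

def coupled_contrastive_loss(encode, fused, ir, vis, mask):
    """Foreground (target) features of the fused image are attracted to the
    infrared source and repelled from the visible source; background (detail)
    features behave the opposite way. `encode` and `mask` are assumptions."""
    fg = lambda x: encode(x * mask)          # foreground part of an image
    bg = lambda x: encode(x * (1.0 - mask))  # background part of an image
    loss_fg = contrastive_term(fg(fused), fg(ir), fg(vis))
    loss_bg = contrastive_term(bg(fused), bg(vis), bg(ir))
    return loss_fg + loss_bg

# Hypothetical usage with an identity "encoder" on toy tensors:
fused = torch.rand(1, 1, 64, 64)
ir, vis = torch.rand_like(fused), torch.rand_like(fused)
mask = (torch.rand_like(fused) > 0.5).float()
loss = coupled_contrastive_loss(lambda x: x, fused, ir, vis, mask)
```

The ratio of positive to negative distances keeps the term scale-free; the paper's actual formulation, its data-sensitive weighting, and its choice of representation space may differ from this sketch.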
Saved in:
Published in: | International journal of computer vision 2024-05, Vol.132 (5), p.1748-1775 |
---|---|
Main authors: | Liu, Jinyuan; Lin, Runjia; Wu, Guanyao; Liu, Risheng; Luo, Zhongxuan; Fan, Xin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1775 |
---|---|
container_issue | 5 |
container_start_page | 1748 |
container_title | International journal of computer vision |
container_volume | 132 |
creator | Liu, Jinyuan; Lin, Runjia; Wu, Guanyao; Liu, Risheng; Luo, Zhongxuan; Fan, Xin |
description | Infrared and visible image fusion aims to provide an informative image by combining complementary information from different sensors. Existing learning-based fusion approaches attempt to construct various loss functions to preserve complementary features, while neglecting the inter-relationship between the two modalities, leading to redundant or even invalid information in the fusion results. Moreover, most methods focus on strengthening the network by increasing its depth while neglecting the importance of feature transmission, causing vital information to degrade. To alleviate these issues, we propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion in an end-to-end manner. Concretely, to simultaneously retain typical features from both modalities and to avoid artifacts emerging in the fused result, we develop a coupled contrastive constraint in our loss function. In a fused image, its foreground target/background detail part is pulled close to the infrared/visible source and pushed far away from the visible/infrared source in the representation space. We further exploit image characteristics to provide data-sensitive weights, allowing our loss function to build a more reliable relationship with the source images. A multi-level attention module is established to learn rich hierarchical feature representations and to comprehensively transfer features in the fusion process. We also apply the proposed CoCoNet to medical image fusion of different types, e.g., magnetic resonance, positron emission tomography, and single photon emission computed tomography images. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation, especially in preserving prominent targets and recovering vital textural details. |
doi_str_mv | 10.1007/s11263-023-01952-1 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0920-5691 |
ispartof | International journal of computer vision, 2024-05, Vol.132 (5), p.1748-1775 |
issn | 0920-5691; 1573-1405 |
language | eng |
recordid | cdi_proquest_journals_3051754345 |
source | Springer Nature - Complete Springer Journals |
subjects | Artificial Intelligence; Computed tomography; Computer Imaging; Computer Science; Computer vision; Degeneration; Image Processing and Computer Vision; Infrared imagery; Learning; Magnetic resonance imaging; Medical imaging; Pattern Recognition; Pattern Recognition and Graphics; Photon emission; Positron emission; Representations; Tomography; Vision |
title | CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T07%3A45%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CoCoNet:%20Coupled%20Contrastive%20Learning%20Network%20with%20Multi-level%20Feature%20Ensemble%20for%20Multi-modality%20Image%20Fusion&rft.jtitle=International%20journal%20of%20computer%20vision&rft.au=Liu,%20Jinyuan&rft.date=2024-05-01&rft.volume=132&rft.issue=5&rft.spage=1748&rft.epage=1775&rft.pages=1748-1775&rft.issn=0920-5691&rft.eissn=1573-1405&rft_id=info:doi/10.1007/s11263-023-01952-1&rft_dat=%3Cproquest_cross%3E3051754345%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3051754345&rft_id=info:pmid/&rfr_iscdi=true |