TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network
The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating the multi-modal local appearance. However, long-range dependencies are directly neglected in existing CNN fusion approaches, impeding balancing the entire image-level perception for complex scenario fusion.
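The pipeline sketched in the abstract (shallow CNN features, a transformer fusion module acting over spatial positions and over channels, then reconstruction of the fused image) can be illustrated with a minimal PyTorch sketch. Everything below, including module names, layer sizes, the single-block depth, and the sigmoid output, is an assumption made for illustration, not the published TGFuse architecture; the adversarial discriminators used during training are omitted here.

```python
import torch
import torch.nn as nn


class ShallowEncoder(nn.Module):
    """Small CNN that extracts shallow features from a single-channel image."""

    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)


class SpatialAttention(nn.Module):
    """Self-attention over spatial positions (tokens are pixels, embeddings are channels)."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        refined, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + refined)             # residual + norm
        return tokens.transpose(1, 2).view(b, c, h, w)


class ChannelAttention(nn.Module):
    """Self-attention over channels: a C x C attention map mixes feature maps
    across channels, independent of the spatial resolution."""

    def forward(self, x):
        b, c, h, w = x.shape
        flat = x.flatten(2)                              # (B, C, H*W)
        attn = torch.softmax(flat @ flat.transpose(1, 2) / (h * w) ** 0.5, dim=-1)
        return x + (attn @ flat).view(b, c, h, w)


class TGFuseSketch(nn.Module):
    """Encode both modalities, concatenate, refine with spatial and channel
    attention, and decode back to a single fused image."""

    def __init__(self, channels=32):
        super().__init__()
        self.ir_enc = ShallowEncoder(channels)
        self.vis_enc = ShallowEncoder(channels)
        self.spatial = SpatialAttention(2 * channels)
        self.channel = ChannelAttention()
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 1), nn.Sigmoid(),
        )

    def forward(self, infrared, visible):
        feats = torch.cat([self.ir_enc(infrared), self.vis_enc(visible)], dim=1)
        feats = self.channel(self.spatial(feats))
        return self.decoder(feats)


if __name__ == "__main__":
    ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
    print(TGFuseSketch()(ir, vis).shape)                 # torch.Size([1, 1, 64, 64])
```

A forward pass on two 64x64 single-channel images produces a fused image of the same size. In a sketch like this the spatial attention is the memory bottleneck, since its attention map grows quadratically with the number of pixels, while the channel attention stays C x C regardless of resolution.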
Published in: | IEEE transactions on image processing 2023-05, Vol.PP, p.1-1 |
---|---|
Main authors: | Rao, Dongyu; Xu, Tianyang; Wu, Xiao-Jun |
Format: | Article |
Language: | eng |
Subjects: | decision-level fusion; Feature extraction; Generative adversarial networks; Image fusion; RGBT tracking; Task analysis; temporal information; Training; Transformers; visual object tracking; Visualization |
Online access: | Order full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on image processing |
container_volume | PP |
creator | Rao, Dongyu; Xu, Tianyang; Wu, Xiao-Jun |
description | The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating the multi-modal local appearance. However, long-range dependencies are directly neglected in existing CNN fusion approaches, impeding balancing the entire image-level perception for complex scenario fusion. In this paper, therefore, we propose an infrared and visible image fusion algorithm based on the transformer module and adversarial learning. Inspired by the global interaction power, we use the transformer technique to learn the effective global fusion relations. In particular, shallow features extracted by the CNN interact in the proposed transformer fusion module to refine the fusion relationship within the spatial scope and across channels simultaneously. Besides, adversarial learning is designed in the training process to improve the output discrimination via imposing competitive consistency from the inputs, reflecting the specific characteristics in infrared and visible images. The experimental performance demonstrates the effectiveness of the proposed modules, with superior improvement against the state-of-the-art, generalising a novel paradigm via transformer and adversarial learning in the fusion task. (See the illustrative training-loss sketch after the record fields below.) |
doi_str_mv | 10.1109/TIP.2023.3273451 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2023-05, Vol.PP, p.1-1 |
issn | 1057-7149; 1941-0042 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TIP_2023_3273451 |
source | IEEE Electronic Library (IEL) |
subjects | decision-level fusion; Feature extraction; Generative adversarial networks; Image fusion; RGBT tracking; Task analysis; temporal information; Training; Transformers; visual object tracking; Visualization |
title | TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T12%3A24%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TGFuse:%20An%20Infrared%20and%20Visible%20Image%20Fusion%20Approach%20Based%20on%20Transformer%20and%20Generative%20Adversarial%20Network&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Rao,%20Dongyu&rft.date=2023-05-10&rft.volume=PP&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2023.3273451&rft_dat=%3Cproquest_RIE%3E2812507246%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2812507246&rft_id=info:pmid/37163395&rft_ieee_id=10122870&rfr_iscdi=true |
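For the adversarial component described in the abstract, one hedged reading is that two discriminators, one per modality, score the fused output against the infrared and the visible input, so the generator must retain characteristics of both. The sketch below assumes a small patch discriminator and a standard BCE objective; the actual TGFuse loss terms, discriminator design, and optimiser settings are not taken from the paper.

```python
import torch
import torch.nn as nn


class PatchDiscriminator(nn.Module):
    """Small convolutional discriminator producing per-patch real/fake logits."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def adversarial_losses(fused, infrared, visible, d_ir, d_vis):
    """Return (generator loss, discriminator loss) for one batch."""
    bce = nn.BCEWithLogitsLoss()
    real, fake = torch.ones_like, torch.zeros_like

    # Discriminators: the source images are "real", the (detached) fused image is "fake".
    ir_real, ir_fake = d_ir(infrared), d_ir(fused.detach())
    vis_real, vis_fake = d_vis(visible), d_vis(fused.detach())
    d_loss = (bce(ir_real, real(ir_real)) + bce(ir_fake, fake(ir_fake))
              + bce(vis_real, real(vis_real)) + bce(vis_fake, fake(vis_fake)))

    # Generator: the fused image should fool both discriminators, so it must
    # keep infrared and visible characteristics simultaneously.
    ir_score, vis_score = d_ir(fused), d_vis(fused)
    g_loss = bce(ir_score, real(ir_score)) + bce(vis_score, real(vis_score))
    return g_loss, d_loss


if __name__ == "__main__":
    ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
    fused = ((ir + vis) / 2).requires_grad_(True)        # stand-in for the fusion network output
    g_loss, d_loss = adversarial_losses(fused, ir, vis, PatchDiscriminator(), PatchDiscriminator())
    print(g_loss.item(), d_loss.item())
```

In a full training loop the discriminator and generator losses would be optimised alternately; only the loss construction is shown here.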