CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation


Bibliographic Details
Published in: Computers in biology and medicine, 2024-01, Vol. 168, Article 107803
Authors: Wang, Cheng; Wang, Le; Wang, Nuoqi; Wei, Xiaoling; Feng, Ting; Wu, Minfeng; Yao, Qi; Zhang, Rongjun
Format: Article
Language: English
Online access: Full text
Abstract: Medical image segmentation faces current challenges in effectively extracting and fusing long-distance and local semantic information, as well as mitigating or eliminating semantic gaps during the encoding and decoding process. To alleviate these two problems, we propose a new U-shaped network structure, called CFATransUnet, with Transformer and CNN blocks as the backbone network, equipped with a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module, containing a Channel-wise Cross Fusion Transformer (CCFT) and Channel-wise Cross Fusion Attention (CCFA). Specifically, we use Transformer and CNN blocks to construct the encoder and decoder for adequate extraction and fusion of long-range and local semantic features. The CCFT module utilizes the self-attention mechanism to reintegrate semantic information from different stages into cross-level global features, reducing the semantic asymmetry between features at different levels. The CCFA module adaptively acquires the importance of each feature channel from a global perspective in a network-learning manner, enhancing the grasp of effective information and suppressing unimportant features to mitigate semantic gaps. The combination of CCFT and CCFA guides the fusion of different levels of features more powerfully with a global perspective. The consistent architecture of the encoder and decoder also alleviates the semantic gap. Experimental results suggest that the proposed CFATransUnet achieves state-of-the-art performance on four datasets. The code is available at https://github.com/CPU0808066/CFATransUnet.
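The CCFA module described in the abstract reweights feature channels by importance learned from a global view. The paper's exact formulation is not reproduced in this record, so the following is only a generic squeeze-and-excitation-style sketch in NumPy of that channel-attention idea; the function name, the `reduction` parameter, and the randomly initialized weights are all illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

def channel_attention(x, reduction=4, rng=None):
    """Generic channel-attention sketch: pool each channel to one scalar,
    pass the channel descriptor through a small two-layer MLP, and use
    sigmoid gates in (0, 1) to rescale the channels of x.

    x: feature map of shape (C, H, W). Weights are random here purely
    for illustration; in a trained network they would be learned.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    c = x.shape[0]
    hidden = max(c // reduction, 1)
    # "Squeeze": global average pooling gives one descriptor per channel.
    z = x.mean(axis=(1, 2))                      # shape (C,)
    # "Excitation": a two-layer MLP produces per-channel gates.
    w1 = rng.standard_normal((hidden, c)) * 0.1  # stand-in learned weights
    w2 = rng.standard_normal((c, hidden)) * 0.1
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))  # sigmoid
    # Rescale each channel: high-gate channels kept, low-gate suppressed.
    return x * s[:, None, None]

feat = np.ones((8, 4, 4))
out = channel_attention(feat)
print(out.shape)  # (8, 4, 4)
```

Because the gates lie strictly in (0, 1), the module can only attenuate channels relative to one another, which matches the abstract's description of "suppressing non-important features" rather than amplifying arbitrary ones.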
DOI: 10.1016/j.compbiomed.2023.107803
PMID: 38064854
Publisher: Elsevier Limited, United States
Copyright © 2023 Elsevier Ltd. All rights reserved.
ORCID: https://orcid.org/0000-0002-1798-9922
ISSN: 0010-4825
EISSN: 1879-0534
Source: Access via ScienceDirect (Elsevier)
Subjects:
Coders
Computer networks
Decoding
Image processing
Image segmentation
Medical imaging
Medical research
Modules
Semantics