CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation
Medical image segmentation faces current challenges in effectively extracting and fusing long-distance and local semantic information, as well as mitigating or eliminating semantic gaps during the encoding and decoding process. To alleviate these two problems, we propose a new U-shaped network structure, called CFATransUnet, with Transformer and CNN blocks as the backbone network, equipped with a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module, containing Channel-wise Cross Fusion Transformer (CCFT) and Channel-wise Cross Fusion Attention (CCFA).
Saved in:
Published in: | Computers in biology and medicine 2024-01, Vol.168, p.107803-107803, Article 107803 |
---|---|
Main authors: | Wang, Cheng; Wang, Le; Wang, Nuoqi; Wei, Xiaoling; Feng, Ting; Wu, Minfeng; Yao, Qi; Zhang, Rongjun |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 107803 |
---|---|
container_issue | |
container_start_page | 107803 |
container_title | Computers in biology and medicine |
container_volume | 168 |
creator | Wang, Cheng; Wang, Le; Wang, Nuoqi; Wei, Xiaoling; Feng, Ting; Wu, Minfeng; Yao, Qi; Zhang, Rongjun |
description | Medical image segmentation faces current challenges in effectively extracting and fusing long-distance and local semantic information, as well as mitigating or eliminating semantic gaps during the encoding and decoding process. To alleviate these two problems, we propose a new U-shaped network structure, called CFATransUnet, with Transformer and CNN blocks as the backbone network, equipped with a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module, containing Channel-wise Cross Fusion Transformer (CCFT) and Channel-wise Cross Fusion Attention (CCFA). Specifically, we use Transformer and CNN blocks to construct the encoder and decoder for adequate extraction and fusion of long-range and local semantic features. The CCFT module utilizes the self-attention mechanism to reintegrate semantic information from different stages into cross-level global features to reduce the semantic asymmetry between features at different levels. The CCFA module adaptively acquires the importance of each feature channel from a global perspective in a network learning manner, enhancing the grasp of effective information and suppressing unimportant features to mitigate semantic gaps. The combination of CCFT and CCFA can guide the effective fusion of different levels of features more powerfully from a global perspective. The consistent architecture of the encoder and decoder also alleviates the semantic gap. Experimental results suggest that the proposed CFATransUnet achieves state-of-the-art performance on four datasets. The code is available at https://github.com/CPU0808066/CFATransUnet. |
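The abstract's description of CCFA — learning a per-channel importance weight from a globally pooled view of the feature map — can be illustrated with a generic squeeze-and-excitation-style sketch in NumPy. This is an illustrative approximation, not the authors' implementation: the function name, the weight matrices `w1`/`w2`, and the reduction ratio are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Reweight channels by globally pooled importance (SE-style sketch).

    feat: (C, H, W) feature map; w1: (C, C//r) and w2: (C//r, C) form a
    small bottleneck MLP that maps pooled statistics to channel weights.
    """
    squeeze = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(squeeze @ w1, 0.0)    # ReLU bottleneck -> (C//r,)
    weights = sigmoid(hidden @ w2)            # per-channel importance in (0, 1)
    return feat * weights[:, None, None]      # scale each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))         # 8 channels, 4x4 spatial
w1 = rng.standard_normal((8, 2))              # reduction ratio r = 4
w2 = rng.standard_normal((2, 8))
out = channel_attention(feat, w1, w2)
```

Because the sigmoid keeps every weight strictly between 0 and 1, each channel is attenuated in proportion to its learned importance; in the paper this gating is learned end-to-end rather than applied with random weights as here.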
doi_str_mv | 10.1016/j.compbiomed.2023.107803 |
format | Article |
eissn | 1879-0534 |
pmid | 38064854 |
publisher | United States: Elsevier Limited |
rights | Copyright © 2023 Elsevier Ltd. All rights reserved. |
orcidid | https://orcid.org/0000-0002-1798-9922 |
fulltext | fulltext |
identifier | ISSN: 0010-4825 |
ispartof | Computers in biology and medicine, 2024-01, Vol.168, p.107803-107803, Article 107803 |
issn | 0010-4825; 1879-0534 |
language | eng |
recordid | cdi_proquest_miscellaneous_2902970181 |
source | Access via ScienceDirect (Elsevier) |
subjects | Coders; Computer networks; Decoding; Image processing; Image segmentation; Medical imaging; Medical research; Modules; Semantics |
title | CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-11-30T13%3A49%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CFATransUnet:%20Channel-wise%20cross%20fusion%20attention%20and%20transformer%20for%202D%20medical%20image%20segmentation&rft.jtitle=Computers%20in%20biology%20and%20medicine&rft.au=Wang,%20Cheng&rft.date=2024-01&rft.volume=168&rft.spage=107803&rft.epage=107803&rft.pages=107803-107803&rft.artnum=107803&rft.issn=0010-4825&rft.eissn=1879-0534&rft_id=info:doi/10.1016/j.compbiomed.2023.107803&rft_dat=%3Cproquest_cross%3E2910203167%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2910203167&rft_id=info:pmid/38064854&rfr_iscdi=true |