FTUNet: A Feature-Enhanced Network for Medical Image Segmentation Based on the Combination of U-Shaped Network and Vision Transformer

Semantic segmentation is widely used on clinical images, where it greatly assists medical diagnosis. To address the loss of semantic inference accuracy caused by feature weakening, a network called FTUNet (Feature-enhanced Transformer UNet) is introduced, building on the classical encoder-decoder architecture. First, a dual-branch encoder is proposed on the U-shaped structure: alongside convolution for feature extraction, a Layer Transformer structure (LTrans) captures long-range dependencies and global context. Second, an Inception-style module focused on local features is placed at the bottleneck; it adopts dilated convolution to enlarge the receptive field and mine deeper semantics from the combined information produced by the dual encoder. Finally, to amplify feature differences, a lightweight feature-polarization attention mechanism is applied at the skip connections, strengthening or suppressing feature channels by reallocating their weights. Experiments on 3 medical datasets compare FTUNet with 6 non-U-shaped models, 5 U-shaped models, and 3 Transformer models across 8 categories of indicators; 9 layer-by-layer ablations and 4 alternative embedding attempts further demonstrate that the proposed structure is optimal.
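The abstract describes each component only at a high level. To make the dual-branch encoder concrete, the following minimal PyTorch sketch shows a ViT-style layer that runs on patch tokens in parallel with a convolutional stage; the module name, embedding dimension, head count, and patch size are illustrative assumptions, not the paper's actual LTrans configuration.

```python
# Minimal sketch (assumed hyperparameters, not the paper's): a transformer
# layer that tokenizes a feature map, applies self-attention to capture
# long-range dependencies, and folds the tokens back into a feature map
# so they can be fused with the parallel convolutional branch.
import torch
import torch.nn as nn

class TransformerBranch(nn.Module):
    def __init__(self, in_ch: int, dim: int = 64, heads: int = 4, patch: int = 4):
        super().__init__()
        # Patch embedding: a strided conv turns the image into a token grid.
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tok = self.embed(x)                   # (B, dim, H/p, W/p)
        b, c, h, w = tok.shape
        seq = tok.flatten(2).transpose(1, 2)  # (B, N, dim) token sequence
        # Self-attention relates every token to every other token,
        # capturing global context that convolution alone misses.
        q = self.norm1(seq)
        a, _ = self.attn(q, q, q)
        seq = seq + a
        seq = seq + self.mlp(self.norm2(seq))
        # Fold tokens back into a feature map for fusion with the conv branch.
        return seq.transpose(1, 2).reshape(b, c, h, w)
```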
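For the bottleneck, the abstract specifies an Inception-style module that uses dilated convolution to enlarge the receptive field. A hedged sketch of that idea follows; the branch count and dilation rates (1, 2, 4) are assumptions, since the exact configuration is not stated in the abstract.

```python
# Sketch of an Inception-style bottleneck whose parallel branches use
# dilated convolutions to widen the receptive field without extra
# parameters. Branch count and dilation rates are assumed for illustration.
import torch
import torch.nn as nn

class DilatedInception(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        branch_ch = out_ch // 3
        # Same 3x3 kernel, growing dilation: each branch sees wider context.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        self.fuse = nn.Conv2d(3 * branch_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate the multi-scale branches, then mix with a 1x1 conv.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```

Because a 3x3 kernel with dilation d and padding d preserves spatial size, the branches stay aligned and can be concatenated directly.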
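For the skip connections, the abstract describes a lightweight "feature polarization" attention that strengthens or suppresses channels by reallocating weights. A squeeze-and-excitation-style channel gate captures that behavior and serves here only as a stand-in; the paper's actual mechanism may differ.

```python
# Squeeze-and-excitation-style channel reweighting as a stand-in for the
# paper's "feature polarization" attention: global pooling summarizes each
# channel, a small bottleneck MLP (as 1x1 convs) produces per-channel gates,
# and the gates strengthen or suppress channels of the skip feature map.
import torch
import torch.nn as nn

class ChannelReweight(nn.Module):
    def __init__(self, ch: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),              # squeeze: (B, C, 1, 1)
            nn.Conv2d(ch, ch // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1),
            nn.Sigmoid(),                         # per-channel weights in (0, 1)
        )

    def forward(self, skip: torch.Tensor) -> torch.Tensor:
        return skip * self.gate(skip)             # reallocate channel weights
```

In a U-shaped decoder, such a gate would reweight the encoder's skip feature map just before it is concatenated with the upsampled decoder features.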


Bibliographic Details
Published in: Neural Processing Letters, 2024-03, Vol. 56 (2), p. 83, Article 83
Authors: Wang, Yuefei; Yu, Xi; Yang, Yixi; Zeng, Shijie; Xu, Yuquan; Feng, Ronghui
Format: Article
Language: English
Subjects: Ablation; Amplification; Artificial Intelligence; Classification; Complex Systems; Computational Intelligence; Computer Science; Convolution; Deep learning; Encoders-Decoders; Feature extraction; Image enhancement; Image segmentation; Medical imaging; Neural networks; Semantic segmentation; Semantics
Online access: Full text
DOI: 10.1007/s11063-024-11533-z
ISSN: 1370-4621 (print); 1573-773X (electronic)
Source: Springer Nature OA/Free Journals; SpringerLink Journals - AutoHoldings