Swin-TransUper: Swin Transformer-based UperNet for medical image segmentation

Bibliographic details
Published in: Multimedia Tools and Applications, 2024, Vol. 83 (42), pp. 89817-89836
Main authors: Yin, Jianjian; Chen, Yi; Li, Chengyu; Zheng, Zhichao; Gu, Yanhui; Zhou, Junsheng
Format: Article
Language: English
Online access: Full text
Description
Summary: Convolutional Neural Network-based UNet and its variants have shown remarkable performance in medical image segmentation. However, these methods capture only local features without spatial correlations and are incapable of global modeling. Previous studies show that both local and global features are critical in computer vision. This paper therefore proposes a pure Transformer model named Swin-TransUper. First, we extend UperNet by incorporating the hierarchical Swin Transformer with shifted windows, thereby enhancing the global modeling capability of the model. Second, we introduce an SPPM (Swin Pyramid Pooling Module) that performs multi-scale feature mining on the deepest features generated by the encoder, fully exploiting their semantic information. Finally, a multi-scale attention module aggregates the multi-scale feature information to obtain a more refined feature map. Our method achieves state-of-the-art DSC (Dice Similarity Coefficient) scores of 80.08%, 90.25%, and 90.62% on the Synapse multi-organ segmentation, ISIC2017, and ACDC datasets, respectively. In addition, experimental results on the ISIC2017 dataset show that Swin-TransUper achieves the best Sensitivity and Accuracy, at 91.20% and 96.44%, respectively. Our code is available at https://github.com/JianJianYin/Swin-TransUper .
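The summary reports results as DSC, Sensitivity, and Accuracy. As a reading aid, the sketch below shows the standard definitions of these three metrics for a single binary mask. It is not taken from the authors' repository: the function names, the flat 0/1 list representation, and the conventions for empty masks are assumptions made here for illustration, and the paper's evaluation (for example, per-organ averaging on Synapse) may differ in detail.

```python
def confusion_counts(pred, target):
    """Return (tp, fp, fn, tn) for two flat binary masks given as sequences of 0/1."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(pred, target):
    """Dice Similarity Coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred, target):
    """Sensitivity (recall): TP / (TP + FN)."""
    tp, _, fn, _ = confusion_counts(pred, target)
    # Assumed convention: with no positive pixels in the target, return 1.0.
    return 1.0 if tp + fn == 0 else tp / (tp + fn)


def accuracy(pred, target):
    """Pixel accuracy: (TP + TN) / total number of pixels."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return (tp + tn) / (tp + fp + fn + tn)


# Toy example: 6 pixels, flattened prediction and ground truth.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice(pred, target), sensitivity(pred, target), accuracy(pred, target))
```

In the toy example there are 2 true positives, 1 false positive, 1 false negative, and 2 true negatives, so all three metrics evaluate to 2/3.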
ISSN: 1573-7721, 1380-7501
DOI: 10.1007/s11042-024-19009-x