GCFormer: Multi-scale feature plays a crucial role in medical images segmentation

Bibliographic Details
Published in:	Knowledge-Based Systems, 2024-09, Vol. 300, p. 112170, Article 112170
Main Authors:	Feng, Yuncong; Cong, Yeming; Xing, Shuaijie; Wang, Hairui; Ren, Zihang; Zhang, Xiaoli
Format:	Article
Language:	English
Subjects:	Attention; Global context vision transformer; Medical image segmentation; Multi-scale feature; TransUnet
Online Access:	Full text

Description
Summary:	Transformer-based networks are becoming indispensable in medical image segmentation. However, most Transformer-based methods overlook the impact of features at different scales on encoding efficiency and neglect the fusion and further processing of multi-scale features. Because these methods do not learn features at different scales, they do not suppress noise interference effectively. As a result, they cannot clearly distinguish the target region from the surrounding tissue, which leads to subpar segmentation results. The problem is particularly pronounced when multiple regions must be segmented, because the edges of the various organs and tissues are not well identified. To enhance learning diversity, we propose the Global Context Transformer (GCFormer), a medical image segmentation network that combines Transformers with CNNs. The network adopts a novel multi-scale feature processing mechanism that encodes and decodes features at different scales, thereby improving segmentation efficiency. We employ the Global Token Generator (GTG) module to filter and partition multi-scale features and extract useful information. The encoder incorporates the Pass Down module to fuse multi-scale information, while the decoder concatenates features of different scales using the Cat module. Experimental results indicate that the proposed algorithm surpasses other mainstream methods in segmentation capability.
• We design a novel medical image segmentation network that combines CNNs and Transformers to capture both local features and long-range dependencies.
• We use filtering and extraction to remove noise, thereby enhancing encoding efficiency.
• We adopt a novel approach to fusing features of different scales, enabling the network to better distinguish target organs from similar backgrounds.
ISSN:	0950-7051
DOI:	10.1016/j.knosys.2024.112170
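
The summary says the decoder concatenates features of different scales. The sketch below shows one generic way such a step could look: a coarse feature map is upsampled to the resolution of a fine one and the channels are stacked. It is not taken from the article. The function names `upsample_nearest` and `concat_scales`, the nested-list data layout, and the choice of nearest-neighbour upsampling are all assumptions made for illustration, and the actual Cat module may work differently.

```python
# Hypothetical illustration only; not the GCFormer implementation.
# A feature map is a list of channels; each channel is a 2D list (H x W).


def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one 2D channel by an integer factor."""
    out = []
    for row in channel:
        # Repeat every value `factor` times along the width.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        out.extend([list(wide) for _ in range(factor)])
    return out


def concat_scales(fine, coarse):
    """Upsample `coarse` to the resolution of `fine`, then stack the channels."""
    fine_h, fine_w = len(fine[0]), len(fine[0][0])
    coarse_h, coarse_w = len(coarse[0]), len(coarse[0][0])
    if (fine_h % coarse_h or fine_w % coarse_w
            or fine_h // coarse_h != fine_w // coarse_w):
        raise ValueError("fine size must be the same integer multiple of coarse size")
    factor = fine_h // coarse_h
    # Channel-wise concatenation: fine channels first, then upsampled coarse ones.
    return fine + [upsample_nearest(channel, factor) for channel in coarse]


# Toy inputs: one fine channel (4 x 4) and two coarse channels (2 x 2).
fine = [[[1, 2, 3, 4] for _ in range(4)]]
coarse = [[[10, 20], [30, 40]], [[0, 1], [1, 0]]]
fused = concat_scales(fine, coarse)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real network a step like this would typically be followed by learned layers that mix the stacked channels; the sketch stops at the concatenation itself.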