A parallelly contextual convolutional transformer for medical image segmentation
Published in: Biomedical Signal Processing and Control, 2024-12, Vol. 98, p. 106674, Article 106674
Main authors: , , , ,
Format: Article
Language: English
Online access: Full text
Abstract: Hybrid architectures based on Convolutional Neural Networks (CNN) and Transformers have been extensively employed in medical image segmentation. However, previous studies have encountered difficulties in effectively combining global and local features or fully exploiting the rich context, leading to suboptimal segmentation. To address this shortcoming, this paper proposes the Parallel Contextual Convolutional Transformer (PCCTrans), whose encoder–decoder consists of Contextual Transformer & Convolution (CoT&Conv) and Fully Convolutional Transformer & Convolution (FCT&Conv) parallel hybrid modules. The proposed Multi-scale Fusion Output (MSF) module and channel-attention skip connections are used to improve segmentation performance. Specifically, PCCTrans follows a U-shaped encoder–decoder design with a shallow CoT block that harnesses the contextual information among the input keys to guide the learning of a dynamic attention matrix, thereby enhancing the acquisition of global information. The deeper, improved FCT block effectively captures the fine-grained nature of the segmentation task and models long-term dependencies in the inputs. At the end of the decoder, the proposed MSF module fuses the features learned by the model to enhance segmentation. The experimental results demonstrate that PCCTrans outperforms existing state-of-the-art models on the Synapse Multi-Organ Segmentation and Automated Cardiac Diagnosis Challenge (ACDC) datasets without any pre-training. On the Dice metric, PCCTrans outperforms its direct competitors by 1.37% on the Synapse dataset and 0.66% on the ACDC dataset, with up to threefold fewer parameters. It is worth mentioning that the proposed method also achieves superior evaluation metrics and segmentation results on four other colour RGB medical image datasets.
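To make the CoT idea in the abstract concrete, below is a minimal PyTorch sketch of a CoT-style block: a grouped 3×3 convolution contextualizes the keys into a static context, which is concatenated with the input (acting as queries) to predict a dynamic attention gate over the value projection, and the static and dynamic contexts are then fused. The layer widths, the grouped-convolution setting, and the simplified elementwise gating are illustrative assumptions, not the exact block used in PCCTrans.

```python
import torch
import torch.nn as nn


class CoTBlock(nn.Module):
    """Minimal sketch of a Contextual Transformer (CoT) style block.

    A grouped k x k convolution over the keys yields a static context;
    concatenated with the query, it predicts a dynamic attention gate
    over the value projection. Widths and gating are assumptions.
    """

    def __init__(self, dim: int = 64, kernel_size: int = 3):
        super().__init__()
        # Static context: contextualize the keys with a grouped k x k conv.
        self.key_embed = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size, padding=kernel_size // 2,
                      groups=4, bias=False),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )
        # Value projection (1 x 1 conv).
        self.value_embed = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(dim),
        )
        # Attention head: [static key context, query] -> per-position gate.
        self.attn_embed = nn.Sequential(
            nn.Conv2d(2 * dim, dim // 4, kernel_size=1, bias=False),
            nn.BatchNorm2d(dim // 4),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // 4, dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k_static = self.key_embed(x)   # static context among keys
        v = self.value_embed(x)        # values
        # Simplified dynamic attention: a sigmoid gate predicted from the
        # concatenated static context and query (the original CoT design
        # uses a grouped local attention here instead).
        gate = torch.sigmoid(self.attn_embed(torch.cat([k_static, x], dim=1)))
        k_dynamic = gate * v           # dynamic context
        return k_static + k_dynamic    # fuse static and dynamic context


if __name__ == "__main__":
    x = torch.randn(1, 64, 56, 56)
    print(CoTBlock(dim=64)(x).shape)  # torch.Size([1, 64, 56, 56])
```

The residual-style fusion of static and dynamic context keeps the block shape-preserving, so it can drop into a U-shaped encoder stage in place of a plain convolution.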
Highlights:
• A novel PCCTrans using parallel blocks is proposed for medical image segmentation.
• Parallel hybrid blocks effectively blend coarse- and fine-grained features.
• Multi-scale fusion effectively improves global information and segmentation details (sketched below).
• PCCTrans has fewer parameters and does not require any pre-training.
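The abstract does not spell out the MSF module's internals, so the following is a hypothetical sketch of a multi-scale fusion output head under common conventions: decoder features from several resolutions are projected to class logits, bilinearly upsampled to the finest scale, and fused by a 1×1 convolution. The class name, channel widths, class count, and fusion order are illustrative assumptions, not the paper's exact MSF design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusionHead(nn.Module):
    """Hypothetical multi-scale fusion output head (widths assumed)."""

    def __init__(self, in_channels, num_classes: int):
        super().__init__()
        # Project each decoder stage to per-class logits.
        self.proj = nn.ModuleList(
            nn.Conv2d(c, num_classes, kernel_size=1) for c in in_channels
        )
        # Fuse the upsampled logit maps with a 1 x 1 conv.
        self.fuse = nn.Conv2d(len(in_channels) * num_classes, num_classes,
                              kernel_size=1)

    def forward(self, feats):
        # feats[0] is assumed to be the finest-resolution decoder feature.
        size = feats[0].shape[-2:]
        logits = [
            F.interpolate(p(f), size=size, mode="bilinear",
                          align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        return self.fuse(torch.cat(logits, dim=1))


if __name__ == "__main__":
    feats = [torch.randn(1, 64, 56, 56),   # finest decoder stage
             torch.randn(1, 128, 28, 28),
             torch.randn(1, 256, 14, 14)]
    head = MultiScaleFusionHead([64, 128, 256], num_classes=9)  # e.g. 9 classes
    print(head(feats).shape)  # torch.Size([1, 9, 56, 56])
```

Fusing logits from several decoder depths lets coarse, semantically strong maps and fine, boundary-accurate maps contribute jointly to the final prediction, which matches the abstract's stated motivation for the MSF module.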
ISSN: 1746-8094
DOI: 10.1016/j.bspc.2024.106674