ParaTransCNN: Parallelized TransCNN Encoder for Medical Image Segmentation
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Convolutional neural network-based methods have become increasingly
popular for medical image segmentation due to their outstanding performance.
However, they struggle to capture long-range dependencies, which are
essential for accurately modeling global contextual correlations. Because
transformer-based methods can model long-range dependencies by expanding the
receptive field, they have gained prominence. Inspired by this, we propose
an advanced 2D feature extraction method that combines convolutional neural
network and Transformer architectures. More specifically, we introduce a
parallelized encoder structure, where one branch uses ResNet to extract local
information from images, while the other branch uses a Transformer to extract
global information. Furthermore, we integrate pyramid structures into the
Transformer to extract global information at varying resolutions, which is
especially useful in dense prediction tasks. To efficiently utilize the
different information in the parallelized encoder at the decoder stage, we use
a channel attention module to merge the features of the encoder and propagate
them through skip connections and bottlenecks. Extensive numerical experiments
are performed on aortic vessel tree, cardiac, and multi-organ datasets.
Compared with state-of-the-art medical image segmentation methods, our method
achieves better segmentation accuracy, especially on small organs. The code is
publicly available at https://github.com/HongkunSun/ParaTransCNN. |
---|---|
DOI: | 10.48550/arxiv.2401.15307 |
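
The fusion step described in the summary (merging the features of two parallel encoder branches with a channel attention module) can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation style channel attention, which this record does not confirm as the authors' design. The function names, toy feature shapes, and random weights are hypothetical and are not taken from the ParaTransCNN repository.

```python
import math
import random


def global_avg_pool(feature_maps):
    """Squeeze step: reduce each H x W channel map to its mean value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]


def dense(vec, weights, biases):
    """Fully connected layer; weights has shape [n_out][n_in]."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def channel_attention_fuse(cnn_feats, trans_feats, w1, b1, w2, b2):
    """Concatenate two branches along channels and reweight each channel."""
    fused = cnn_feats + trans_feats            # channel-wise concatenation
    squeezed = global_avg_pool(fused)          # one scalar per channel
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]   # ReLU bottleneck
    gates = [1.0 / (1.0 + math.exp(-g))        # sigmoid gate in (0, 1)
             for g in dense(hidden, w2, b2)]
    scaled = [[[g * x for x in row] for row in ch]
              for g, ch in zip(gates, fused)]
    return scaled, gates


def random_weights(n_out, n_in, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_out)]


def random_features(n_channels, size, rng):
    """Toy feature maps: n_channels maps of shape size x size."""
    return [[[rng.random() for _ in range(size)] for _ in range(size)]
            for _ in range(n_channels)]


rng = random.Random(0)
cnn_feats = random_features(2, 4, rng)     # stands in for the local (CNN) branch
trans_feats = random_features(2, 4, rng)   # stands in for the global (Transformer) branch

channels, reduced = 4, 2                   # assumed reduction ratio of 2
w1, b1 = random_weights(reduced, channels, rng), [0.0] * reduced
w2, b2 = random_weights(channels, reduced, rng), [0.0] * channels

fused, gates = channel_attention_fuse(cnn_feats, trans_feats, w1, b1, w2, b2)
print(len(fused), [round(g, 3) for g in gates])
```

Each gate is a scalar between 0 and 1 that scales one channel of the concatenated features, so the decoder receives a weighted mix of both branches. In a trained network the weights would be learned rather than sampled at random.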