nnY-Net: Swin-NeXt with Cross-Attention for 3D Medical Images Segmentation

Main Authors: 
Format: Article
Language: English
Subjects: 
Online Access: Order full text

Abstract: This paper presents a novel 3D medical image segmentation architecture called nnY-Net. The name comes from the cross-attention module our model adds at the bottom of the U-net structure, turning the U into a Y. We integrate the advantages of two of the latest SOTA models, MedNeXt and SwinUNETR, and use the Swin Transformer as the encoder and ConvNeXt as the decoder to design the Swin-NeXt structure. In a Cross Attention module, our model uses the lowest-level feature map of the encoder as Key and Value and patient features, such as pathology and treatment information, as Query to calculate the attention weights. Moreover, we simplify several pre- and post-processing steps as well as data augmentation methods for 3D image segmentation based on the dynUnet and nnU-net frameworks, and we integrate the proposed Swin-NeXt with Cross-Attention architecture into this pipeline. Finally, we construct a DiceFocalCELoss to improve training efficiency in the face of uneven convergence across voxel classes.

DOI: 10.48550/arxiv.2501.01406
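
The abstract describes the cross-attention step only at a high level: the encoder's lowest-level feature map supplies Key and Value, while per-patient clinical features supply the Query. The minimal sketch below illustrates that idea; the module name, tensor shapes, query projection, and the use of torch.nn.MultiheadAttention are assumptions for illustration, not the paper's implementation.

```python
# Sketch of patient-conditioned cross-attention at the U-net bottleneck.
# The abstract only says: bottleneck feature map -> Key/Value, patient
# features -> Query. Everything else here (names, shapes, fusion by
# broadcast addition) is an assumption.
import torch
import torch.nn as nn


class PatientCrossAttention(nn.Module):
    def __init__(self, feat_dim: int, patient_dim: int, num_heads: int = 8):
        super().__init__()
        # Project raw patient features into the embedding space of the bottleneck.
        self.query_proj = nn.Linear(patient_dim, feat_dim)
        self.attn = nn.MultiheadAttention(embed_dim=feat_dim,
                                          num_heads=num_heads,
                                          batch_first=True)

    def forward(self, bottleneck: torch.Tensor, patient: torch.Tensor) -> torch.Tensor:
        # bottleneck: (B, C, D, H, W) lowest-level 3D feature map from the encoder
        # patient:    (B, patient_dim) clinical features (pathology, treatment, ...)
        b, c, d, h, w = bottleneck.shape
        kv = bottleneck.flatten(2).transpose(1, 2)       # (B, D*H*W, C) voxel tokens as Key/Value
        q = self.query_proj(patient).unsqueeze(1)        # (B, 1, C) single Query token
        fused, _ = self.attn(query=q, key=kv, value=kv)  # attention weights over voxels
        # Broadcast the patient-conditioned context back onto the feature map.
        return bottleneck + fused.squeeze(1).view(b, c, 1, 1, 1)


# Toy usage with assumed shapes:
if __name__ == "__main__":
    module = PatientCrossAttention(feat_dim=320, patient_dim=16)
    feats = torch.randn(2, 320, 4, 4, 4)
    meta = torch.randn(2, 16)
    print(module(feats, meta).shape)  # torch.Size([2, 320, 4, 4, 4])
```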
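The DiceFocalCELoss is likewise only named in the abstract. A plausible reading is a weighted sum of a soft Dice term, a focal term, and voxel-wise cross-entropy; the sketch below assumes equal weights and a standard focal formulation (gamma = 2), none of which is specified by the paper.

```python
# Assumed composition of the DiceFocalCELoss mentioned in the abstract:
# soft Dice + focal + cross-entropy, summed with configurable weights.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiceFocalCELoss(nn.Module):
    def __init__(self, gamma: float = 2.0, smooth: float = 1e-5,
                 w_dice: float = 1.0, w_focal: float = 1.0, w_ce: float = 1.0):
        super().__init__()
        self.gamma, self.smooth = gamma, smooth
        self.w_dice, self.w_focal, self.w_ce = w_dice, w_focal, w_ce

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits: (B, C, D, H, W) raw network outputs; target: (B, D, H, W) class indices
        num_classes = logits.shape[1]
        probs = torch.softmax(logits, dim=1)
        onehot = F.one_hot(target, num_classes).permute(0, 4, 1, 2, 3).float()

        # Soft Dice over the spatial dimensions, averaged over classes and batch.
        dims = (2, 3, 4)
        inter = (probs * onehot).sum(dims)
        denom = probs.sum(dims) + onehot.sum(dims)
        dice = 1.0 - ((2.0 * inter + self.smooth) / (denom + self.smooth)).mean()

        # Focal term: down-weight easy voxels so slowly converging classes keep contributing.
        ce_per_voxel = F.cross_entropy(logits, target, reduction="none")
        pt = torch.exp(-ce_per_voxel)
        focal = ((1.0 - pt) ** self.gamma * ce_per_voxel).mean()

        ce = ce_per_voxel.mean()
        return self.w_dice * dice + self.w_focal * focal + self.w_ce * ce


# Toy usage: 2 volumes, 3 classes, 8^3 voxels.
if __name__ == "__main__":
    loss_fn = DiceFocalCELoss()
    logits = torch.randn(2, 3, 8, 8, 8)
    target = torch.randint(0, 3, (2, 8, 8, 8))
    print(loss_fn(logits, target))
```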