DMA‐Net: A dual branch encoder and multi‐scale cross attention fusion network for skin lesion segmentation
Published in: | IET Image Processing 2024-12, Vol.18 (14), p.4531-4541 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Automatic segmentation of skin lesions is an important step in computer‐aided diagnosis. However, lesion areas vary significantly in size and shape and have low contrast with normal skin tissue, so their boundaries are not clearly distinguishable. This leads to a high possibility of incorrect segmentation and makes the task highly challenging. To overcome these difficulties, this paper proposes a medical image segmentation architecture named dual branch encoder and multi‐scale cross attention fusion network (DMA‐Net). It includes a dual‐branch encoder, based on a convolutional neural network and an improved channel‐enhanced Mamba, that comprehensively extracts local and global information from dermoscopy images. Additionally, to enhance the interaction and fusion of local and global features, a multi‐scale cross attention fusion module cross‐merges features in different directions and at different scales, maximizing the advantages of the dual‐branch encoder and achieving precise segmentation of skin lesions. Extensive experiments are conducted on three public skin lesion datasets, ISIC‐2018, ISIC‐2017, and ISIC‐2016, to verify the effectiveness of the proposed method. The Dice similarity coefficient scores on the three datasets reached 81.77%, 81.68% and 85.60%, respectively, surpassing most state‐of‐the‐art methods.
(1) This paper proposes a channel‐enhanced omnidirectional selective scan module (COSSM) block that integrates channel attention and a convolutional feed‐forward network into the Mamba block, enhancing its channel information interaction capability. Based on this, we replace the traditional single‐branch encoder architecture with a dual‐branch encoder built from COSSM and a convolutional neural network (CNN), yielding the dual branch encoder and multi‐scale cross attention fusion network (DMA‐Net). By fully integrating feature information from both branches, DMA‐Net can extract rich local features and capture the global contextual information that is important for skin lesion segmentation.
(2) This paper proposes a multi‐scale cross attention fusion module. This module uses multi‐scale one‐dimensional convolution to extract features along the horizontal and vertical directions from the CNN and Mamba branches, and then uses cross‐attention to let features in different directions from the two branches interact. This module achieves full integration of the features from the two branches, maximizing the advantages of the |
---|---|
ISSN: | 1751-9659 1751-9667 |
DOI: | 10.1049/ipr2.13265 |
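
The summary reports results as Dice similarity coefficient scores. As a reading aid, the following is a minimal sketch of the standard definition, Dice = 2|P ∩ T| / (|P| + |T|), for two binary masks. It is not taken from the paper: the function name `dice_coefficient`, the nested-list mask representation, and the smoothing term `eps` are assumptions made for illustration, and the authors' evaluation code may differ.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,  # assumed smoothing term; avoids 0/0 when both masks are empty
) -> float:
    """Dice similarity between two binary masks (1 = lesion, 0 = background)."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels marked as lesion in both masks
            pred_sum += p
            target_sum += t
    return (2.0 * intersection + eps) / (pred_sum + target_sum + eps)


# Hypothetical 2x3 masks: two lesion pixels overlap, each mask has three.
predicted_mask = [[0, 1, 1], [0, 1, 0]]
reference_mask = [[0, 1, 0], [0, 1, 1]]
score = dice_coefficient(predicted_mask, reference_mask)
```

In this toy case the two masks share two lesion pixels out of three each, so the score is approximately 2·2 / (3 + 3) ≈ 0.667. A reported score such as 85.60% corresponds to 0.856 on this scale.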