Medical lesion segmentation by combining multimodal images with modality weighted UNet

Bibliographic details
Published in: Medical physics (Lancaster) 2022-06, Vol.49 (6), p.3692-3704
Main authors: Zhu, Xiner; Wu, Yichao; Hu, Haoji; Zhuang, Xianwei; Yao, Jincao; Ou, Di; Li, Wei; Song, Mei; Feng, Na; Xu, Dong
Format: Article
Language: English
Online access: Full text
Description
Summary:
Purpose: Automatic segmentation of medical lesions is a prerequisite for efficient clinical analysis. Segmentation algorithms for multimodal medical images have received much attention in recent years, and different strategies for multimodal combination (or fusion), such as probability theory, fuzzy models, belief functions, and deep neural networks, have been developed. In this paper, we propose the modality weighted UNet (MW‐UNet) and an attention‐based fusion method to combine multimodal images for medical lesion segmentation.

Methods: MW‐UNet is a multimodal fusion method based on UNet. It uses a shallower architecture and fewer feature map channels to reduce the number of network parameters, and it introduces a new multimodal fusion method called fusion attention. Feature maps in intermediate layers are combined using a weighted sum rule and fusion attention. During training, all the weight parameters are updated through backpropagation like the other parameters in the network. We also incorporate residual blocks into MW‐UNet to further improve segmentation performance. The automatic multimodal lesion segmentations were compared with the manual contours using (1) five metrics: Dice, 95% Hausdorff distance (HD95), volumetric overlap error (VOE), relative volume difference (RVD), and mean intersection over union (mIoU); and (2) the number of parameters and FLOPs, which measure the complexity of the network.

Results: The proposed method is verified on ZJCHD, a contrast‐enhanced computed tomography (CECT) data set for liver lesion segmentation collected at the Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China. For accuracy evaluation, we use 120 patients with liver lesions from ZJCHD, of whom 100 are used for fourfold cross‐validation (CV) and 20 for the hold‐out (HO) test. The mean Dice was 90.55 ± 14.44% and 89.31 ± 19.07% for the HO and CV tests, respectively. The corresponding HD95, VOE, RVD, and mIoU of the two tests are 1.95 ± 1.83 and 2.67 ± 3.35 mm, 13.11 ± 15.83% and 13.13 ± 18.52%, 12.20 ± 18.20% and 13.00 ± 21.82%, and 83.79 ± 15.83% and 82.35 ± 20.03%. The number of parameters and FLOPs of our method are 4.04 M and 18.36 G, respectively.

Conclusions: The results show that our method performs well on multimodal liver lesion segmentation. It can be easily extended to other multimodal data se
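
For readers unfamiliar with the overlap metrics named in the summary, the following is a minimal sketch of how Dice, VOE, RVD, and mIoU are commonly computed from a predicted and a reference binary mask. It is not the authors' evaluation code. The function name `overlap_metrics`, the flat-list mask representation, the signed RVD convention (predicted minus reference volume, divided by reference volume), and the averaging of mIoU over the background and lesion classes are all assumptions made for illustration. HD95 is omitted because it requires surface distances.

```python
def overlap_metrics(pred, truth):
    """Return Dice, VOE, RVD and mIoU for two equal-length binary masks.

    `pred` and `truth` are flat sequences of 0/1 voxel labels
    (hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Confusion counts for the lesion (foreground) class.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = len(pred) - tp - fp - fn

    if tp + fn == 0:
        raise ValueError("reference mask contains no lesion voxels")

    dice = 2 * tp / (2 * tp + fp + fn)
    iou_fg = tp / (tp + fp + fn)
    # Background IoU; defined as 1.0 when there is no background at all.
    bg_union = tn + fp + fn
    iou_bg = tn / bg_union if bg_union else 1.0

    return {
        "dice": dice,
        "voe": 1 - iou_fg,                          # volumetric overlap error
        "rvd": ((tp + fp) - (tp + fn)) / (tp + fn),  # signed, assumed convention
        "miou": (iou_fg + iou_bg) / 2,              # mean over two classes
    }


if __name__ == "__main__":
    predicted = [1, 1, 1, 1, 0, 0, 0, 0]
    reference = [1, 1, 0, 0, 0, 0, 0, 0]
    print(overlap_metrics(predicted, reference))
```

All four values are fractions here; multiplying by 100 gives percentages of the kind reported in the summary.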
ISSN:0094-2405
2473-4209
DOI:10.1002/mp.15610
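
The summary describes combining per-modality feature maps with a weighted sum whose weights are learned by backpropagation. The sketch below illustrates only the forward pass of such a weighted sum, under stated assumptions: the softmax normalization of the weights, the names `softmax` and `weighted_sum_fusion`, and the nested-list feature maps are hypothetical choices for illustration, not details taken from the paper. The fusion attention component and the training loop are not shown.

```python
import math


def softmax(logits):
    """Turn raw modality scores into positive weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum_fusion(feature_maps, logits):
    """Fuse same-shaped 2D feature maps, one per modality.

    `logits` holds one scalar per modality. In a trained network these
    would be learnable parameters; here they are plain numbers.
    """
    if len(feature_maps) != len(logits):
        raise ValueError("need exactly one weight per modality")
    weights = softmax(logits)
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * fm[r][c] for w, fm in zip(weights, feature_maps))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


if __name__ == "__main__":
    modality_a = [[1.0, 2.0], [3.0, 4.0]]
    modality_b = [[3.0, 4.0], [5.0, 6.0]]
    # Equal scores give equal weights, so the result is the element-wise mean.
    print(weighted_sum_fusion([modality_a, modality_b], [0.0, 0.0]))
```

In a deep learning framework the same idea would typically be expressed with trainable tensors, so that the modality weights receive gradients alongside the convolution kernels.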