Cross‐shaped windows transformer with self‐supervised pretraining for clinically significant prostate cancer detection in bi‐parametric MRI
Saved in:
Published in: Medical physics (Lancaster), 2025-02, Vol. 52 (2), p. 993-1004
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Summary:
Background
Bi‐parametric magnetic resonance imaging (bpMRI) has demonstrated promising results in prostate cancer (PCa) detection. Vision transformers have achieved competitive performance compared to convolutional neural networks (CNNs) in deep learning, but they need abundant annotated data for training. Self‐supervised learning can effectively leverage unlabeled data to extract useful semantic representations without annotation and its associated costs.
Purpose
This study proposes a novel self‐supervised learning framework and a transformer model to enhance PCa detection using prostate bpMRI.
Methods and materials
We introduce a novel end‐to‐end Cross‐Shaped windows (CSwin) transformer UNet model, CSwin UNet, to detect clinically significant prostate cancer (csPCa) in prostate bpMRI. We also propose a multitask self‐supervised learning framework to leverage unlabeled data and improve network generalizability. Using a large prostate bpMRI dataset (PI‐CAI) with 1476 patients, we first pretrain the CSwin transformer using multitask self‐supervised learning to improve data efficiency and network generalizability. We then finetune using lesion annotations to perform csPCa detection. We also test network generalization using a separate bpMRI dataset with 158 patients (Prostate158).
Results
Five‐fold cross validation shows that self‐supervised CSwin UNet achieves 0.888 ± 0.010 area under the receiver operating characteristic curve (AUC) and 0.545 ± 0.060 Average Precision (AP) on the PI‐CAI dataset, significantly outperforming five comparable models (nnFormer, Swin UNETR, DynUNet, Attention UNet, UNet). On the external Prostate158 dataset, self‐supervised CSwin UNet achieves 0.79 AUC and 0.45 AP, still outperforming all other comparable methods and demonstrating good generalization to external data.
Conclusions
This study proposes CSwin UNet, a new transformer‐based model for end‐to‐end detection of csPCa, enhanced by self‐supervised pretraining to improve network generalizability. We employ an automatic weighted loss (AWL) to unify pretext tasks, improving representation learning. Evaluated on two multi‐institutional public datasets, our method surpasses existing methods in detection metrics and demonstrates good generalization to external data.
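The conclusions above credit an automatic weighted loss (AWL) with unifying the self‐supervised pretext tasks, but this record does not spell out the formulation or the tasks themselves. Below is a minimal PyTorch sketch of one common way such a loss is implemented, with a learnable weight per task in the spirit of uncertainty‐based multitask weighting; the class name and the three placeholder pretext losses are assumptions for illustration, not the paper's code.

```python
# Minimal sketch (not the paper's implementation): learnable weighting of
# several self-supervised pretext losses, so no per-task coefficients have
# to be hand-tuned during pretraining.
import torch
import torch.nn as nn


class AutomaticWeightedLoss(nn.Module):
    """Combine N losses as sum_i( L_i / (2 * s_i^2) + log(1 + s_i^2) ),
    where each s_i is a learnable scalar (assumed formulation)."""

    def __init__(self, num_tasks: int):
        super().__init__()
        self.task_weights = nn.Parameter(torch.ones(num_tasks))

    def forward(self, *losses: torch.Tensor) -> torch.Tensor:
        total = losses[0].new_zeros(())
        for s, loss in zip(self.task_weights, losses):
            # Small s -> large effective weight; the log term keeps s from
            # collapsing toward zero.
            total = total + 0.5 / (s ** 2) * loss + torch.log(1.0 + s ** 2)
        return total


# Toy usage with three placeholder pretext losses (e.g., reconstruction,
# rotation prediction, contrastive) standing in for the unspecified tasks.
awl = AutomaticWeightedLoss(num_tasks=3)
pretext_losses = [torch.tensor(0.8, requires_grad=True),
                  torch.tensor(1.3, requires_grad=True),
                  torch.tensor(0.4, requires_grad=True)]
total_loss = awl(*pretext_losses)
total_loss.backward()  # gradients reach both the losses and awl.task_weights
print(float(total_loss))
```

In a full pretraining run, `awl.task_weights` would be optimized jointly with the transformer encoder; the pretrained weights would then be loaded into the detection network and fine‐tuned on the csPCa lesion annotations, matching the two‐stage workflow described in the Methods.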
ISSN: 0094-2405, 2473-4209
DOI: 10.1002/mp.17546