Pathologist-like explainable AI for interpretable Gleason grading in prostate cancer
The aggressiveness of prostate cancer, the most common cancer in men worldwide, is primarily assessed based on histopathological data using the Gleason scoring system. While artificial intelligence (AI) has shown promise in accurately predicting Gleason scores, these predictions often lack inherent...
Gespeichert in:
Hauptverfasser: | , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The aggressiveness of prostate cancer, the most common cancer in men
worldwide, is primarily assessed based on histopathological data using the
Gleason scoring system. While artificial intelligence (AI) has shown promise in
accurately predicting Gleason scores, these predictions often lack inherent
explainability, potentially leading to distrust in human-machine interactions.
To address this issue, we introduce a novel dataset of 1,015 tissue microarray
core images, annotated by an international group of 54 pathologists. The
annotations provide detailed localized pattern descriptions for Gleason grading
in line with international guidelines. Utilizing this dataset, we develop an
inherently explainable AI system based on a U-Net architecture that provides
predictions leveraging pathologists' terminology. This approach circumvents
post-hoc explainability methods while maintaining or exceeding the performance
of methods trained directly for Gleason pattern segmentation (Dice score: 0.713
$\pm$ 0.003 trained on explanations vs. 0.691 $\pm$ 0.010 trained on Gleason
patterns). By employing soft labels during training, we capture the intrinsic
uncertainty in the data, yielding strong results in Gleason pattern
segmentation even in the context of high interobserver variability. With the
release of this dataset, we aim to encourage further research into segmentation
in medical tasks with high levels of subjectivity and to advance the
understanding of pathologists' reasoning processes. |
---|---|
DOI: | 10.48550/arxiv.2410.15012 |