Leveraging image captions for selective whole slide image annotation
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Acquiring annotations for deep learning tasks based on whole slide
images (WSIs), such as creating tissue segmentation masks or detecting mitotic figures,
is a laborious process due to the extensive image size and the significant
manual work involved in the annotation. This paper focuses on identifying and
annotating specific image regions that optimize model training, given a limited
annotation budget. While random sampling helps capture data variance by
collecting annotation regions throughout the WSIs, insufficient data curation
may result in an inadequate representation of minority classes. Recent studies
proposed diversity sampling to select a set of regions that maximally represent
unique characteristics of the WSIs. This is done by pretraining on unlabeled
data through self-supervised learning and then clustering all regions in the
latent space. However, establishing the optimal number of clusters can be
difficult, and not all clusters are task-relevant. This paper presents prototype
sampling, a new method for annotation region selection. It discovers regions
exhibiting typical characteristics of each task-specific class. The process
entails recognizing class prototypes from extensive histopathology
image-caption databases and detecting unlabeled image regions that resemble
these prototypes. Our results show that prototype sampling is more effective
than random and diversity sampling in identifying annotation regions with
valuable training information, resulting in improved model performance in
semantic segmentation and mitotic figure detection tasks. Code is available at
https://github.com/DeepMicroscopy/Prototype-sampling. |
---|---|
DOI: | 10.48550/arxiv.2407.06363 |
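
The abstract describes prototype sampling in two steps: derive class prototypes from a histopathology image-caption database, then pick unlabeled regions that resemble those prototypes. The following is a minimal sketch of that idea, not the authors' implementation (see the linked repository for that). It assumes that embeddings for database images and WSI regions are already computed by some encoder, that a class is matched to captions by a simple keyword search, and that resemblance is measured by cosine similarity. All function names, parameters, and toy values are hypothetical.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def build_prototypes(db_embeddings, db_captions, class_keywords):
    """Average the embeddings of database images whose caption mentions a class.

    Keyword matching is an assumption of this sketch, not the paper's method.
    """
    db_embeddings = l2_normalize(np.asarray(db_embeddings, dtype=float))
    prototypes = {}
    for class_name, keyword in class_keywords.items():
        hits = [i for i, caption in enumerate(db_captions)
                if keyword.lower() in caption.lower()]
        if not hits:
            continue  # no caption mentions this class, so no prototype
        prototypes[class_name] = l2_normalize(db_embeddings[hits].mean(axis=0))
    return prototypes


def select_regions(region_embeddings, prototypes, budget_per_class):
    """Return, per class, the indices of the regions most similar to its prototype."""
    regions = l2_normalize(np.asarray(region_embeddings, dtype=float))
    selected = {}
    for class_name, prototype in prototypes.items():
        similarity = regions @ prototype  # cosine similarity per region
        top = np.argsort(-similarity)[:budget_per_class]
        selected[class_name] = top.tolist()
    return selected


# Toy data: three captioned database images and three unlabeled regions.
db_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
db_captions = [
    "Tumor cells with mitotic figures",
    "Mitotic figure in carcinoma",
    "Normal stroma",
]
class_keywords = {"mitotic": "mitotic", "stroma": "stroma"}
region_embeddings = [[0.1, 1.0], [1.0, 0.05], [0.7, 0.7]]

prototypes = build_prototypes(db_embeddings, db_captions, class_keywords)
selected = select_regions(region_embeddings, prototypes, budget_per_class=1)
print(selected)
```

With a per-class budget, every task-relevant class receives annotation regions, which is the property the abstract contrasts with random sampling (minority classes may be missed) and diversity sampling (clusters need not be task-relevant).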