Are Natural Domain Foundation Models Useful for Medical Image Classification?
Saved in:
| Format | Article |
|---|---|
| Language | English |
| Online access | Order full text |
Abstract: The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OpenCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results: DINOv2 consistently outperforms the standard practice of ImageNet pretraining, while the other foundation models fail to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.
DOI: 10.48550/arxiv.2310.19522
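
The transfer-learning setup described in the abstract can be illustrated with a short sketch. The snippet below is a minimal linear-probe example, assuming DINOv2 ViT-S/14 loaded from torch.hub as a frozen feature extractor and a generic ImageFolder-style medical dataset at a placeholder path; the paper's actual datasets, training settings, and hyperparameters are not given in the abstract, and linear probing is only one of several possible adaptation strategies.

```python
# Minimal linear-probe sketch: frozen DINOv2 features + a trainable linear classifier.
# Dataset path, image size, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen DINOv2 backbone from torch.hub (ViT-S/14, 384-dim CLS embeddings).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").to(device).eval()
for p in backbone.parameters():
    p.requires_grad = False

# Standard ImageNet-style preprocessing; 224 is divisible by the 14-pixel patch size.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical medical dataset laid out as one folder per class.
train_set = datasets.ImageFolder("data/medical_train", transform=preprocess)
loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)
num_classes = len(train_set.classes)

# Linear probe: a single trainable layer on top of the frozen features.
probe = nn.Linear(384, num_classes).to(device)
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        with torch.no_grad():
            feats = backbone(images)          # (B, 384) CLS embeddings
        loss = criterion(probe(feats), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last-batch loss {loss.item():.4f}")
```

Full fine-tuning of the backbone would be the natural alternative setting to compare against this frozen-feature baseline when evaluating transferability.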