Are Natural Domain Foundation Models Useful for Medical Image Classification?

The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks.

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Huix, Joana Palés, Ganeshan, Adithya Raju, Haslum, Johan Fredin, Söderberg, Magnus, Matsoukas, Christos, Smith, Kevin
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Huix, Joana Palés
Ganeshan, Adithya Raju
Haslum, Johan Fredin
Söderberg, Magnus
Matsoukas, Christos
Smith, Kevin
description The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OpenCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results. DINOv2 consistently outperforms the standard practice of ImageNet pretraining. However, other foundation models failed to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.
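The training settings the abstract alludes to commonly include linear probing, i.e. fitting a linear classifier on frozen backbone features. As a schematic sketch only (not the paper's actual code: the features below are synthetic Gaussian stand-ins rather than real DINOv2 or CLIP embeddings, and `make_features` / `train_linear_probe` are hypothetical helpers), a linear probe reduces to multinomial logistic regression:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_features(n, dim, centers, noise=0.5):
    """Synthetic stand-in for frozen-backbone features: class-conditional Gaussians."""
    y = rng.integers(0, len(centers), size=n)
    x = centers[y] + noise * rng.normal(size=(n, dim))
    return x, y

def train_linear_probe(x, y, n_classes, lr=0.1, epochs=200):
    """Multinomial logistic regression on frozen features (the 'linear probe')."""
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        logits = x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)            # softmax probabilities
        grad = (p - onehot) / len(x)                 # cross-entropy gradient
        w -= lr * x.T @ grad
        b -= lr * grad.sum(axis=0)
    return w, b

def accuracy(w, b, x, y):
    return float(((x @ w + b).argmax(axis=1) == y).mean())

n_classes, dim = 4, 32
centers = rng.normal(size=(n_classes, dim))  # shared by train and test splits
x_tr, y_tr = make_features(500, dim, centers)
x_te, y_te = make_features(200, dim, centers)
w, b = train_linear_probe(x_tr, y_tr, n_classes)
acc = accuracy(w, b, x_te, y_te)
```

In a real evaluation, `x_tr` / `x_te` would instead be embeddings extracted once from a frozen encoder over each medical imaging dataset; only the probe's weights are trained, which is what makes the comparison isolate the quality of the pretrained representation.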
doi_str_mv 10.48550/arxiv.2310.19522
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2310.19522
language eng
recordid cdi_arxiv_primary_2310_19522
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Are Natural Domain Foundation Models Useful for Medical Image Classification?
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T12%3A58%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Are%20Natural%20Domain%20Foundation%20Models%20Useful%20for%20Medical%20Image%20Classification?&rft.au=Huix,%20Joana%20Pal%C3%A9s&rft.date=2023-10-30&rft_id=info:doi/10.48550/arxiv.2310.19522&rft_dat=%3Carxiv_GOX%3E2310_19522%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true