Are Natural Domain Foundation Models Useful for Medical Image Classification?

The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks.

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Huix, Joana Palés, Ganeshan, Adithya Raju, Haslum, Johan Fredin, Söderberg, Magnus, Matsoukas, Christos, Smith, Kevin
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Huix, Joana Palés
Ganeshan, Adithya Raju
Haslum, Johan Fredin
Söderberg, Magnus
Matsoukas, Christos
Smith, Kevin
description The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OpenCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results. DINOv2 consistently outperforms the standard practice of ImageNet pretraining. However, other foundation models failed to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.
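The training settings the abstract alludes to commonly include linear probing, i.e. fitting a linear classifier on frozen backbone features. As a schematic sketch only (not the paper's actual code: the features below are synthetic Gaussian stand-ins rather than real DINOv2 or CLIP embeddings, and `make_features` / `train_linear_probe` are hypothetical helpers), a linear probe reduces to multinomial logistic regression:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_features(n, dim, centers, noise=0.5):
    """Synthetic stand-in for frozen-backbone features: class-conditional Gaussians."""
    y = rng.integers(0, len(centers), size=n)
    x = centers[y] + noise * rng.normal(size=(n, dim))
    return x, y

def train_linear_probe(x, y, n_classes, lr=0.1, epochs=200):
    """Multinomial logistic regression on frozen features (the 'linear probe')."""
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        logits = x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)            # softmax probabilities
        grad = (p - onehot) / len(x)                 # cross-entropy gradient
        w -= lr * x.T @ grad
        b -= lr * grad.sum(axis=0)
    return w, b

def accuracy(w, b, x, y):
    return float(((x @ w + b).argmax(axis=1) == y).mean())

n_classes, dim = 4, 32
centers = rng.normal(size=(n_classes, dim))  # shared by train and test splits
x_tr, y_tr = make_features(500, dim, centers)
x_te, y_te = make_features(200, dim, centers)
w, b = train_linear_probe(x_tr, y_tr, n_classes)
acc = accuracy(w, b, x_te, y_te)
```

In a real evaluation, `x_tr` / `x_te` would instead be embeddings extracted once from a frozen encoder over each medical imaging dataset; only the probe's weights are trained, which is what makes the comparison isolate the quality of the pretrained representation.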
doi_str_mv 10.48550/arxiv.2310.19522
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2310.19522
language eng
recordid cdi_arxiv_primary_2310_19522
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Are Natural Domain Foundation Models Useful for Medical Image Classification?
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T12%3A58%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Are%20Natural%20Domain%20Foundation%20Models%20Useful%20for%20Medical%20Image%20Classification?&rft.au=Huix,%20Joana%20Pal%C3%A9s&rft.date=2023-10-30&rft_id=info:doi/10.48550/arxiv.2310.19522&rft_dat=%3Carxiv_GOX%3E2310_19522%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true