Skin Cancer Detection utilizing Deep Learning: Classification of Skin Lesion Images using a Vision Transformer
Skin cancer detection still represents a major challenge in healthcare. Common detection methods can be lengthy and require human assistance, which falls short in many countries. Previous research demonstrates how convolutional neural networks (CNNs) can help effectively through both automation and an accuracy that is comparable to the human level. ...
Saved in:
Main authors: | Flosdorf, Carolin; Engelker, Justin; Keller, Igor; Mohr, Nicolas |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Flosdorf, Carolin; Engelker, Justin; Keller, Igor; Mohr, Nicolas |
description | Skin cancer detection still represents a major challenge in healthcare.
Common detection methods can be lengthy and require human assistance, which
falls short in many countries. Previous research demonstrates how convolutional
neural networks (CNNs) can help effectively through both automation and an
accuracy that is comparable to the human level. However, despite the progress
of previous decades, precision is still limited, leading to substantial
misclassifications that seriously affect people's health. Hence, we employ a
Vision Transformer (ViT), developed in recent years on the basis of a
self-attention mechanism; specifically, two configurations of a pre-trained
ViT. We generally find superior metrics for classifying skin lesions when
comparing them to baseline models such as a decision tree classifier and a
k-nearest neighbor (KNN) classifier, as well as to CNNs and less complex ViTs.
In particular, we attach greater importance to the performance on melanoma,
the most lethal type of skin cancer. The ViT-L32 model achieves an accuracy of
91.57% and a melanoma recall of 58.54%, while ViT-L16 achieves an accuracy of
92.79% and a melanoma recall of 56.10%. This offers a potential tool for
faster and more accurate diagnoses and an overall improvement for the
healthcare sector. |
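The abstract reports two metrics per model: overall accuracy and per-class recall for melanoma. As a minimal sketch of how such figures are computed (the label names and predictions below are hypothetical, not the paper's data):

```python
# Compute overall accuracy and per-class recall from true vs. predicted labels.
# Labels and predictions here are made-up examples for illustration only.

def accuracy(y_true, y_pred):
    # fraction of all predictions that match the true label
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall(y_true, y_pred, cls):
    # recall for one class: true positives / actual positives
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    actual = sum(t == cls for t in y_true)
    return tp / actual if actual else 0.0

y_true = ["nevus", "melanoma", "melanoma", "nevus", "bkl", "melanoma"]
y_pred = ["nevus", "melanoma", "nevus",    "nevus", "bkl", "melanoma"]

print(round(accuracy(y_true, y_pred), 4))            # 0.8333
print(round(recall(y_true, y_pred, "melanoma"), 4))  # 0.6667
```

The gap between the two metrics mirrors the paper's results: a model can score high accuracy overall while still missing a substantial share of melanoma cases, which is why melanoma recall is reported separately.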
doi_str_mv | 10.48550/arxiv.2407.18554 |
format | Article |
creationdate | 2024-07-26 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2407.18554 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2407_18554 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Skin Cancer Detection utilizing Deep Learning: Classification of Skin Lesion Images using a Vision Transformer |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-11T20%3A56%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Skin%20Cancer%20Detection%20utilizing%20Deep%20Learning:%20Classification%20of%20Skin%20Lesion%20Images%20using%20a%20Vision%20Transformer&rft.au=Flosdorf,%20Carolin&rft.date=2024-07-26&rft_id=info:doi/10.48550/arxiv.2407.18554&rft_dat=%3Carxiv_GOX%3E2407_18554%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |