CellViT: Vision Transformers for precise cell segmentation and classification

Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, they are challenging due to variance in nuclei staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been used extensively for this task, we explore the potential of Transformer-based networks in combination with large-scale pre-training in this domain. We therefore introduce a new method for automated instance segmentation of cell nuclei in digitized tissue samples, using a deep learning architecture based on Vision Transformers called CellViT. CellViT is trained and evaluated on the PanNuke dataset, one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 nuclei annotated into 5 clinically important classes across 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT encoder pre-trained on 104 million histological image patches, achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an F1 detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT.

Highlights
• Novel U-Net-style network for nuclei segmentation using Vision Transformers (CellViT)
• Our method outperforms existing techniques and is state-of-the-art on PanNuke
• First to embed pre-trained transformer-based foundation models for nuclei segmentation
• We demonstrate generalizability on the MoNuSeg dataset without fine-tuning
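To make the described architecture concrete, below is a minimal, self-contained PyTorch sketch of the encoder-decoder idea from the abstract: image patches are embedded as Vision Transformer tokens, the processed token sequence is reshaped back into a 2-D feature map, and a convolutional decoder upsamples it to per-pixel nuclei class logits. This is not the authors' implementation (that is available at the GitHub link above): the tiny encoder merely stands in for the large pre-trained ViT and Segment Anything encoders, the U-Net skip connections and auxiliary heads of the full CellViT are omitted, and all class and parameter names are illustrative.

import torch
import torch.nn as nn

class TinyViTEncoder(nn.Module):
    """Minimal ViT-style encoder: patch embedding + transformer blocks.
    A stand-in for the large pre-trained encoders discussed above."""
    def __init__(self, img_size=256, patch=16, dim=192, depth=4, heads=4):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.grid = img_size // patch
        self.pos = nn.Parameter(torch.zeros(1, self.grid ** 2, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, C)
        tokens = self.blocks(tokens + self.pos)
        # Reshape the token sequence back into a 2-D feature map for decoding.
        return tokens.transpose(1, 2).reshape(x.size(0), -1, self.grid, self.grid)

class UNetStyleDecoder(nn.Module):
    """Convolutional decoder that upsamples the ViT feature map 16x back to
    the input resolution and predicts per-pixel nuclei class logits."""
    def __init__(self, dim=192, n_classes=6):  # 5 PanNuke classes + background
        super().__init__()
        stages, ch = [], dim
        for _ in range(4):  # four 2x upsampling stages: 16 -> 256
            stages += [nn.ConvTranspose2d(ch, ch // 2, 2, stride=2),
                       nn.Conv2d(ch // 2, ch // 2, 3, padding=1),
                       nn.ReLU(inplace=True)]
            ch //= 2
        self.up = nn.Sequential(*stages)
        self.head = nn.Conv2d(ch, n_classes, kernel_size=1)

    def forward(self, feats):
        return self.head(self.up(feats))

encoder, decoder = TinyViTEncoder(), UNetStyleDecoder()
logits = decoder(encoder(torch.randn(1, 3, 256, 256)))
print(logits.shape)  # torch.Size([1, 6, 256, 256])

For context on the headline numbers: panoptic quality follows the standard definition PQ = (sum of IoUs over matched instance pairs) / (|TP| + 0.5 |FP| + 0.5 |FN|), so the reported mean PQ of 0.50 jointly reflects detection quality and mask overlap per nucleus class.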

Bibliographic details
Published in: Medical Image Analysis, 2024-05, Vol. 94, Article 103143
Main authors: Hörst, Fabian; Rempe, Moritz; Heine, Lukas; Seibold, Constantin; Keyl, Julius; Baldini, Giulia; Ugurel, Selma; Siveke, Jens; Grünwald, Barbara; Egger, Jan; Kleesiek, Jens
Format: Article
Language: English
Subjects: Cell Nucleus; Cell segmentation; Deep learning; Digital pathology; Eosine Yellowish-(YS); Hematoxylin; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Staining and Labeling; Vision transformer
Online access: Full text
DOI: 10.1016/j.media.2024.103143
ISSN: 1361-8415
EISSN: 1361-8423
PMID: 38507894
Publisher: Elsevier B.V., Netherlands
Source: MEDLINE; Elsevier ScienceDirect Journals Complete