CellViT: Vision Transformers for precise cell segmentation and classification

Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, they are challenging due to variance in nuclei staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been used extensively for this task, we explore the potential of Transformer-based networks in combination with large-scale pre-training in this domain. We therefore introduce a new method for automated instance segmentation of cell nuclei in digitized tissue samples, using a deep learning architecture based on Vision Transformers called CellViT. CellViT is trained and evaluated on the PanNuke dataset, one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 nuclei annotated into 5 clinically important classes across 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT encoder pre-trained on 104 million histological image patches, achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an F1 detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT.

Highlights
• Novel U-Net-style network for nuclei segmentation using Vision Transformers (CellViT)
• Our method outperforms existing techniques and is state-of-the-art on PanNuke
• First to embed pre-trained transformer-based foundation models for nuclei segmentation
• We demonstrate generalizability on the MoNuSeg dataset without fine-tuning
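To make the described architecture concrete, below is a minimal, self-contained PyTorch sketch of the encoder-decoder idea from the abstract: image patches are embedded as Vision Transformer tokens, the processed token sequence is reshaped back into a 2-D feature map, and a convolutional decoder upsamples it to per-pixel nuclei class logits. This is not the authors' implementation (that is available at the GitHub link above): the tiny encoder merely stands in for the large pre-trained ViT and Segment Anything encoders, the U-Net skip connections and auxiliary heads of the full CellViT are omitted, and all class and parameter names are illustrative.

import torch
import torch.nn as nn

class TinyViTEncoder(nn.Module):
    """Minimal ViT-style encoder: patch embedding + transformer blocks.
    A stand-in for the large pre-trained encoders discussed above."""
    def __init__(self, img_size=256, patch=16, dim=192, depth=4, heads=4):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.grid = img_size // patch
        self.pos = nn.Parameter(torch.zeros(1, self.grid ** 2, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, C)
        tokens = self.blocks(tokens + self.pos)
        # Reshape the token sequence back into a 2-D feature map for decoding.
        return tokens.transpose(1, 2).reshape(x.size(0), -1, self.grid, self.grid)

class UNetStyleDecoder(nn.Module):
    """Convolutional decoder that upsamples the ViT feature map 16x back to
    the input resolution and predicts per-pixel nuclei class logits."""
    def __init__(self, dim=192, n_classes=6):  # 5 PanNuke classes + background
        super().__init__()
        stages, ch = [], dim
        for _ in range(4):  # four 2x upsampling stages: 16 -> 256
            stages += [nn.ConvTranspose2d(ch, ch // 2, 2, stride=2),
                       nn.Conv2d(ch // 2, ch // 2, 3, padding=1),
                       nn.ReLU(inplace=True)]
            ch //= 2
        self.up = nn.Sequential(*stages)
        self.head = nn.Conv2d(ch, n_classes, kernel_size=1)

    def forward(self, feats):
        return self.head(self.up(feats))

encoder, decoder = TinyViTEncoder(), UNetStyleDecoder()
logits = decoder(encoder(torch.randn(1, 3, 256, 256)))
print(logits.shape)  # torch.Size([1, 6, 256, 256])

For context on the headline numbers: panoptic quality follows the standard definition PQ = (sum of IoUs over matched instance pairs) / (|TP| + 0.5 |FP| + 0.5 |FN|), so the reported mean PQ of 0.50 jointly reflects detection quality and mask overlap per nucleus class.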

Bibliographic details
Published in: Medical Image Analysis, 2024-05, Vol. 94, Article 103143
Main authors: Hörst, Fabian; Rempe, Moritz; Heine, Lukas; Seibold, Constantin; Keyl, Julius; Baldini, Giulia; Ugurel, Selma; Siveke, Jens; Grünwald, Barbara; Egger, Jan; Kleesiek, Jens
Format: Article
Language: English
Subjects: Cell Nucleus; Cell segmentation; Deep learning; Digital pathology; Eosine Yellowish-(YS); Hematoxylin; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Staining and Labeling; Vision transformer
Online access: Full text
DOI: 10.1016/j.media.2024.103143
ISSN: 1361-8415
EISSN: 1361-8423
PMID: 38507894
Publisher: Elsevier B.V., Netherlands
Source: MEDLINE; Elsevier ScienceDirect Journals Complete