Federated learning for computational pathology on gigapixel whole slide images

• We present the first large-scale application of privacy-preserving federated learning to weakly supervised computational pathology on gigapixel whole slide images.
• Validation on multi-class classification, binary classification and survival prediction using multi-institutional datasets on two diffe...

Detailed description

Bibliographic details
Published in: Medical image analysis 2022-02, Vol.76, p.102298-102298, Article 102298
Main authors: Lu, Ming Y., Chen, Richard J., Kong, Dehan, Lipkova, Jana, Singh, Rajendra, Williamson, Drew F.K., Chen, Tiffany Y., Mahmood, Faisal
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 102298
container_issue
container_start_page 102298
container_title Medical image analysis
container_volume 76
creator Lu, Ming Y.
Chen, Richard J.
Kong, Dehan
Lipkova, Jana
Singh, Rajendra
Williamson, Drew F.K.
Chen, Tiffany Y.
Mahmood, Faisal
description • We present the first large-scale application of privacy-preserving federated learning to weakly supervised computational pathology on gigapixel whole slide images.
• Validation on multi-class classification, binary classification and survival prediction using multi-institutional datasets on two different disease models using thousands of gigapixel whole slide images.
• Multiple instance learning-inspired framework for interpretable, weakly-supervised survival prediction from histology whole slides using patient-level labels from multi-centric data.
Deep learning-based computational pathology algorithms have demonstrated profound ability to excel in a wide array of tasks that range from characterization of well-known morphological phenotypes to predicting non-human-identifiable features from histology such as molecular alterations. However, the development of robust, adaptable and accurate deep learning-based models often relies on the collection and time-consuming curation of large, high-quality annotated training data that should ideally come from diverse sources and patient populations to cater for the heterogeneity that exists in such datasets. Multi-centric and collaborative integration of medical data across multiple institutions can naturally help overcome this challenge and boost model performance, but is limited by privacy concerns, among other difficulties that may arise in the complex data sharing process as models scale towards using hundreds of thousands of gigapixel whole slide images. In this paper, we introduce privacy-preserving federated learning for gigapixel whole slide images in computational pathology using weakly-supervised attention multiple instance learning and differential privacy. We evaluated our approach on two different diagnostic problems using thousands of histology whole slide images with only slide-level labels.
Additionally, we present a weakly-supervised learning framework for survival prediction and patient stratification from whole slide images and demonstrate its effectiveness in a federated setting. Our results show that using federated learning, we can effectively develop accurate weakly-supervised deep learning models from distributed data silos without direct data sharing and its associated complexities, while also preserving differential privacy using randomized noise generation. We also make available an easy-to-use federated learning for computational pathology software package: http://github.com/mahmoodlab/HistoFL.
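The abstract describes training across institutions without sharing slides, with randomized noise added for privacy. As a rough illustration of that general idea only, the sketch below shows sample-weighted averaging of per-site model weights after each site perturbs its weights with Gaussian noise. It is not taken from the paper or from the HistoFL package: the function names, the dictionary-of-lists weight format, the `"attention"` key and the `sigma` value are all assumptions made for this example, and adding noise by itself does not establish a formal differential privacy guarantee (calibrating the noise is outside the scope of the sketch).

```python
import random
from typing import Dict, List

# Hypothetical weight format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def add_gaussian_noise(weights: Weights, sigma: float, rng: random.Random) -> Weights:
    """Return a copy of `weights` with zero-mean Gaussian noise added to every value."""
    return {
        name: [w + rng.gauss(0.0, sigma) for w in values]
        for name, values in weights.items()
    }


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its number of training samples."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(cw[name][i] * n for cw, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    site_a = {"attention": [1.0, 2.0]}  # weights trained locally at site A
    site_b = {"attention": [3.0, 4.0]}  # weights trained locally at site B
    # Each site perturbs its own weights before they leave the institution.
    noisy = [add_gaussian_noise(w, sigma=0.01, rng=rng) for w in (site_a, site_b)]
    # The coordinating server only ever sees the noisy weights.
    print(federated_average(noisy, client_sizes=[100, 300]))
```

In this arrangement only model weights, never slides or labels, would leave a site, which is the property the abstract refers to as avoiding direct data sharing.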
doi_str_mv 10.1016/j.media.2021.102298
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9340569</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1361841521003431</els_id><sourcerecordid>2610911920</sourcerecordid><originalsourceid>FETCH-LOGICAL-c553t-5c42c2f791a36af9db703a11de6f58daea6d03d8e660b94852388ad999a5199c3</originalsourceid><addsrcrecordid>eNp9kU9vEzEQxVcIREvhEyAhS1y4JPh_1geQqooWpKpc4GxN7NmtI2e92LuFfnucpo1oD5xsjX_zZvxe07xldMko0x83yy36AEtOOasVzk37rDlmQrNFK7l4frgzddS8KmVDKV1JSV82R0IaVjXEcXN1jh4zTOhJRMhDGHrSpUxc2o7zBFNIA0QywnSdYupvSRpIH3oYwx-M5HctIikxeCRhCz2W182LDmLBN_fnSfPz_MuPs6-Ly-8X385OLxdOKTEtlJPc8W5lGAgNnfHrFRXAmEfdqdYDgvZU-Ba1pmsjW8VF24I3xoBixjhx0nze647zurrgcJgyRDvmuka-tQmCffwyhGvbpxtrhKRKmyrw4V4gp18zlsluQ3EYIwyY5mK5ZrR6ZDit6Psn6CbNudqyo_SqVZJKUSmxp1xOpWTsDsswand52Y29y8vu8rL7vGrXu3__ceh5CKgCn_YAVjdvAmZbXMDBVaWMbrI-hf8O-AvbZKgz</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2667854043</pqid></control><display><type>article</type><title>Federated learning for computational pathology on gigapixel whole slide images</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Lu, Ming Y. ; Chen, Richard J. ; Kong, Dehan ; Lipkova, Jana ; Singh, Rajendra ; Williamson, Drew F.K. ; Chen, Tiffany Y. ; Mahmood, Faisal</creator><creatorcontrib>Lu, Ming Y. ; Chen, Richard J. ; Kong, Dehan ; Lipkova, Jana ; Singh, Rajendra ; Williamson, Drew F.K. ; Chen, Tiffany Y. 
; Mahmood, Faisal</creatorcontrib><description>•We present the first large-scale application of privacy-preserving federated learning to weakly supervised computational pathology on gigapixel whole slide images.•Validation on multi-class classification, binary classification and survival prediction using multi-institutional datasets on two different disease models using thousands of gigapixel whole slide images.•Multiple instance learning-inspired framework for interpretable, weakly-supervised survival prediction from histology whole slides using patient-level labels from multi-centric data. Deep learning-based computational pathology algorithms have demonstrated profound ability to excel in a wide array of tasks that range from characterization of well-known morphological phenotypes to predicting non-human-identifiable features from histology such as molecular alterations. However, the development of robust, adaptable and accurate deep learning-based models often relies on the collection and time-consuming curation of large, high-quality annotated training data that should ideally come from diverse sources and patient populations to cater for the heterogeneity that exists in such datasets. Multi-centric and collaborative integration of medical data across multiple institutions can naturally help overcome this challenge and boost model performance, but is limited by privacy concerns, among other difficulties that may arise in the complex data sharing process as models scale towards using hundreds of thousands of gigapixel whole slide images. In this paper, we introduce privacy-preserving federated learning for gigapixel whole slide images in computational pathology using weakly-supervised attention multiple instance learning and differential privacy. We evaluated our approach on two different diagnostic problems using thousands of histology whole slide images with only slide-level labels. 
Additionally, we present a weakly-supervised learning framework for survival prediction and patient stratification from whole slide images and demonstrate its effectiveness in a federated setting. Our results show that using federated learning, we can effectively develop accurate weakly-supervised deep learning models from distributed data silos without direct data sharing and its associated complexities, while also preserving differential privacy using randomized noise generation. We also make available an easy-to-use federated learning for computational pathology software package: http://github.com/mahmoodlab/HistoFL.</description><identifier>ISSN: 1361-8415</identifier><identifier>EISSN: 1361-8423</identifier><identifier>DOI: 10.1016/j.media.2021.102298</identifier><identifier>PMID: 34911013</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Algorithms ; Computational pathology ; Computer applications ; Data retrieval ; Deep learning ; Federated learning ; Heterogeneity ; Histological Techniques ; Histology ; Humans ; Information sharing ; Machine learning ; Medical imaging ; Noise generation ; Pathology ; Patients ; Phenotypes ; Privacy ; Split learning ; Survival ; Whole slide imaging</subject><ispartof>Medical image analysis, 2022-02, Vol.76, p.102298-102298, Article 102298</ispartof><rights>2021 The Author(s)</rights><rights>Copyright © 2021 The Author(s). Published by Elsevier B.V. 
All rights reserved.</rights><rights>Copyright Elsevier BV Feb 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c553t-5c42c2f791a36af9db703a11de6f58daea6d03d8e660b94852388ad999a5199c3</citedby><cites>FETCH-LOGICAL-c553t-5c42c2f791a36af9db703a11de6f58daea6d03d8e660b94852388ad999a5199c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1361841521003431$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34911013$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Lu, Ming Y.</creatorcontrib><creatorcontrib>Chen, Richard J.</creatorcontrib><creatorcontrib>Kong, Dehan</creatorcontrib><creatorcontrib>Lipkova, Jana</creatorcontrib><creatorcontrib>Singh, Rajendra</creatorcontrib><creatorcontrib>Williamson, Drew F.K.</creatorcontrib><creatorcontrib>Chen, Tiffany Y.</creatorcontrib><creatorcontrib>Mahmood, Faisal</creatorcontrib><title>Federated learning for computational pathology on gigapixel whole slide images</title><title>Medical image analysis</title><addtitle>Med Image Anal</addtitle><description>•We present the first large-scale application of privacy-preserving federated learning to weakly supervised computational pathology on gigapixel whole slide images.•Validation on multi-class classification, binary classification and survival prediction using multi-institutional datasets on two different disease models using thousands of gigapixel whole slide images.•Multiple instance learning-inspired framework for interpretable, weakly-supervised survival prediction from histology whole slides using patient-level labels from 
multi-centric data. Deep learning-based computational pathology algorithms have demonstrated profound ability to excel in a wide array of tasks that range from characterization of well-known morphological phenotypes to predicting non-human-identifiable features from histology such as molecular alterations. However, the development of robust, adaptable and accurate deep learning-based models often relies on the collection and time-consuming curation of large, high-quality annotated training data that should ideally come from diverse sources and patient populations to cater for the heterogeneity that exists in such datasets. Multi-centric and collaborative integration of medical data across multiple institutions can naturally help overcome this challenge and boost model performance, but is limited by privacy concerns, among other difficulties that may arise in the complex data sharing process as models scale towards using hundreds of thousands of gigapixel whole slide images. In this paper, we introduce privacy-preserving federated learning for gigapixel whole slide images in computational pathology using weakly-supervised attention multiple instance learning and differential privacy. We evaluated our approach on two different diagnostic problems using thousands of histology whole slide images with only slide-level labels. Additionally, we present a weakly-supervised learning framework for survival prediction and patient stratification from whole slide images and demonstrate its effectiveness in a federated setting. Our results show that using federated learning, we can effectively develop accurate weakly-supervised deep learning models from distributed data silos without direct data sharing and its associated complexities, while also preserving differential privacy using randomized noise generation. 
We also make available an easy-to-use federated learning for computational pathology software package: http://github.com/mahmoodlab/HistoFL.</description><subject>Algorithms</subject><subject>Computational pathology</subject><subject>Computer applications</subject><subject>Data retrieval</subject><subject>Deep learning</subject><subject>Federated learning</subject><subject>Heterogeneity</subject><subject>Histological Techniques</subject><subject>Histology</subject><subject>Humans</subject><subject>Information sharing</subject><subject>Machine learning</subject><subject>Medical imaging</subject><subject>Noise generation</subject><subject>Pathology</subject><subject>Patients</subject><subject>Phenotypes</subject><subject>Privacy</subject><subject>Split learning</subject><subject>Survival</subject><subject>Whole slide imaging</subject><issn>1361-8415</issn><issn>1361-8423</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kU9vEzEQxVcIREvhEyAhS1y4JPh_1geQqooWpKpc4GxN7NmtI2e92LuFfnucpo1oD5xsjX_zZvxe07xldMko0x83yy36AEtOOasVzk37rDlmQrNFK7l4frgzddS8KmVDKV1JSV82R0IaVjXEcXN1jh4zTOhJRMhDGHrSpUxc2o7zBFNIA0QywnSdYupvSRpIH3oYwx-M5HctIikxeCRhCz2W182LDmLBN_fnSfPz_MuPs6-Ly-8X385OLxdOKTEtlJPc8W5lGAgNnfHrFRXAmEfdqdYDgvZU-Ba1pmsjW8VF24I3xoBixjhx0nze647zurrgcJgyRDvmuka-tQmCffwyhGvbpxtrhKRKmyrw4V4gp18zlsluQ3EYIwyY5mK5ZrR6ZDit6Psn6CbNudqyo_SqVZJKUSmxp1xOpWTsDsswand52Y29y8vu8rL7vGrXu3__ceh5CKgCn_YAVjdvAmZbXMDBVaWMbrI-hf8O-AvbZKgz</recordid><startdate>20220201</startdate><enddate>20220201</enddate><creator>Lu, Ming Y.</creator><creator>Chen, Richard J.</creator><creator>Kong, Dehan</creator><creator>Lipkova, Jana</creator><creator>Singh, Rajendra</creator><creator>Williamson, Drew F.K.</creator><creator>Chen, Tiffany Y.</creator><creator>Mahmood, Faisal</creator><general>Elsevier B.V</general><general>Elsevier 
BV</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>K9.</scope><scope>NAPCQ</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20220201</creationdate><title>Federated learning for computational pathology on gigapixel whole slide images</title><author>Lu, Ming Y. ; Chen, Richard J. ; Kong, Dehan ; Lipkova, Jana ; Singh, Rajendra ; Williamson, Drew F.K. ; Chen, Tiffany Y. ; Mahmood, Faisal</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c553t-5c42c2f791a36af9db703a11de6f58daea6d03d8e660b94852388ad999a5199c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Computational pathology</topic><topic>Computer applications</topic><topic>Data retrieval</topic><topic>Deep learning</topic><topic>Federated learning</topic><topic>Heterogeneity</topic><topic>Histological Techniques</topic><topic>Histology</topic><topic>Humans</topic><topic>Information sharing</topic><topic>Machine learning</topic><topic>Medical imaging</topic><topic>Noise generation</topic><topic>Pathology</topic><topic>Patients</topic><topic>Phenotypes</topic><topic>Privacy</topic><topic>Split learning</topic><topic>Survival</topic><topic>Whole slide imaging</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lu, Ming Y.</creatorcontrib><creatorcontrib>Chen, Richard J.</creatorcontrib><creatorcontrib>Kong, Dehan</creatorcontrib><creatorcontrib>Lipkova, Jana</creatorcontrib><creatorcontrib>Singh, Rajendra</creatorcontrib><creatorcontrib>Williamson, Drew F.K.</creatorcontrib><creatorcontrib>Chen, Tiffany Y.</creatorcontrib><creatorcontrib>Mahmood, 
Faisal</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Medical image analysis</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lu, Ming Y.</au><au>Chen, Richard J.</au><au>Kong, Dehan</au><au>Lipkova, Jana</au><au>Singh, Rajendra</au><au>Williamson, Drew F.K.</au><au>Chen, Tiffany Y.</au><au>Mahmood, Faisal</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Federated learning for computational pathology on gigapixel whole slide images</atitle><jtitle>Medical image analysis</jtitle><addtitle>Med Image Anal</addtitle><date>2022-02-01</date><risdate>2022</risdate><volume>76</volume><spage>102298</spage><epage>102298</epage><pages>102298-102298</pages><artnum>102298</artnum><issn>1361-8415</issn><eissn>1361-8423</eissn><abstract>•We present the first large-scale application of privacy-preserving federated learning to weakly supervised computational pathology on gigapixel whole slide images.•Validation on multi-class classification, binary classification and survival prediction using multi-institutional datasets on two different disease models using thousands of gigapixel whole slide 
images.•Multiple instance learning-inspired framework for interpretable, weakly-supervised survival prediction from histology whole slides using patient-level labels from multi-centric data. Deep learning-based computational pathology algorithms have demonstrated profound ability to excel in a wide array of tasks that range from characterization of well-known morphological phenotypes to predicting non-human-identifiable features from histology such as molecular alterations. However, the development of robust, adaptable and accurate deep learning-based models often relies on the collection and time-consuming curation of large, high-quality annotated training data that should ideally come from diverse sources and patient populations to cater for the heterogeneity that exists in such datasets. Multi-centric and collaborative integration of medical data across multiple institutions can naturally help overcome this challenge and boost model performance, but is limited by privacy concerns, among other difficulties that may arise in the complex data sharing process as models scale towards using hundreds of thousands of gigapixel whole slide images. In this paper, we introduce privacy-preserving federated learning for gigapixel whole slide images in computational pathology using weakly-supervised attention multiple instance learning and differential privacy. We evaluated our approach on two different diagnostic problems using thousands of histology whole slide images with only slide-level labels. Additionally, we present a weakly-supervised learning framework for survival prediction and patient stratification from whole slide images and demonstrate its effectiveness in a federated setting. Our results show that using federated learning, we can effectively develop accurate weakly-supervised deep learning models from distributed data silos without direct data sharing and its associated complexities, while also preserving differential privacy using randomized noise generation. 
We also make available an easy-to-use federated learning for computational pathology software package: http://github.com/mahmoodlab/HistoFL.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>34911013</pmid><doi>10.1016/j.media.2021.102298</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1361-8415
ispartof Medical image analysis, 2022-02, Vol.76, p.102298-102298, Article 102298
issn 1361-8415
1361-8423
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9340569
source MEDLINE; Elsevier ScienceDirect Journals
subjects Algorithms
Computational pathology
Computer applications
Data retrieval
Deep learning
Federated learning
Heterogeneity
Histological Techniques
Histology
Humans
Information sharing
Machine learning
Medical imaging
Noise generation
Pathology
Patients
Phenotypes
Privacy
Split learning
Survival
Whole slide imaging
title Federated learning for computational pathology on gigapixel whole slide images
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T10%3A30%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Federated%20learning%20for%20computational%20pathology%20on%20gigapixel%20whole%20slide%20images&rft.jtitle=Medical%20image%20analysis&rft.au=Lu,%20Ming%20Y.&rft.date=2022-02-01&rft.volume=76&rft.spage=102298&rft.epage=102298&rft.pages=102298-102298&rft.artnum=102298&rft.issn=1361-8415&rft.eissn=1361-8423&rft_id=info:doi/10.1016/j.media.2021.102298&rft_dat=%3Cproquest_pubme%3E2610911920%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2667854043&rft_id=info:pmid/34911013&rft_els_id=S1361841521003431&rfr_iscdi=true