Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity
Perfluoroalkyl and polyfluoroalkyl substances (PFAS) pose a significant hazard because of their widespread industrial uses, environmental persistence, and bioaccumulation. A growing, increasingly diverse inventory of PFAS, including 8163 chemicals, has recently been updated by the U.S. Environmental...
Published in: | Journal of chemical information and modeling 2021-12, Vol.61 (12), p.5793-5803 |
---|---|
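Main authors: | Feinstein, Jeremy ; Sivaraman, Ganesh ; Picel, Kurt ; Peters, Brian ; Vázquez-Mayagoitia, Álvaro ; Ramanathan, Arvind ; MacDonell, Margaret ; Foster, Ian ; Yan, Eugene |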
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 5803 |
---|---|
container_issue | 12 |
container_start_page | 5793 |
container_title | Journal of chemical information and modeling |
container_volume | 61 |
creator | Feinstein, Jeremy ; Sivaraman, Ganesh ; Picel, Kurt ; Peters, Brian ; Vázquez-Mayagoitia, Álvaro ; Ramanathan, Arvind ; MacDonell, Margaret ; Foster, Ian ; Yan, Eugene |
description | Perfluoroalkyl and polyfluoroalkyl substances (PFAS) pose a significant hazard because of their widespread industrial uses, environmental persistence, and bioaccumulation. A growing, increasingly diverse inventory of PFAS, including 8163 chemicals, has recently been updated by the U.S. Environmental Protection Agency. However, with the exception of a handful of well-studied examples, little is known about their human toxicity potential because of the substantial resources required for in vivo toxicity experiments. We tackle the problem of expensive in vivo experiments by evaluating multiple machine learning (ML) methods, including random forests, deep neural networks (DNN), graph convolutional networks, and Gaussian processes, for predicting acute toxicity (e.g., median lethal dose, or LD50) of PFAS compounds. To address the scarcity of toxicity information for PFAS, publicly available datasets of oral rat LD50 for all organic compounds are aggregated and used to develop state-of-the-art ML source models for transfer learning. A total of 519 fluorinated compounds containing two or more C-F bonds with known toxicity are used for knowledge transfer to ensembles of the best-performing source model, DNN, to generate the target models for the PFAS domain with access to uncertainty. This study predicts toxicity for PFAS with a defined chemical structure. To further inform prediction confidence, the transfer-learned model is embedded within a SelectiveNet architecture, where the model is allowed to identify regions of prediction with greater confidence and abstain from those with high uncertainty using a calibrated cutoff rate. |
doi_str_mv | 10.1021/acs.jcim.1c01204 |
format | Article |
fullrecord | <record><control><sourceid>proquest_osti_</sourceid><recordid>TN_cdi_osti_scitechconnect_1835619</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2610414325</sourcerecordid><originalsourceid>FETCH-LOGICAL-a433t-d540e1689405bd47fabc511af1e2382a94bb89f481802dcbca0f9be1797e9dee3</originalsourceid><addsrcrecordid>eNp1kcFr2zAUh8VYWbtu952G2S471JmeJCvWsXRbWwis0BR2E7L8tDlzpFSyof7vqzRJKYOdJMT3-z2ePkI-AJ0BZfDV2DRb2W49A0uBUfGKnEAlVKkk_fX6cK-UPCZvU1pRyrmS7A055kLRiov6hLR33mIcTOeHqbz2LsQ1tsU3xE2xjMYnh7FYoIm-87-L4IobjK4fQwym_zv1hfFtcRP66eXb7dikweTaYhkeOtsN0zty5Eyf8P3-PCV3P74vL67Kxc_L64vzRWkE50PZVoIiyFoJWjWtmDvT2ArAOEDGa2aUaJpaOVFDTVlrG2uoUw3CXM1RtYj8lHza9YY0dDrl0Wj_2OA92kFDzSsJKkNfdtAmhvsR06DXXbLY98ZjGJNmEqgAwVmV0c__oKswRp9X2FK1YEJKmSm6o2wMKUV0ehO7tYmTBqq3mnTWpLea9F5TjnzcF49N_u_nwMFLBs52wFP0MPS_fY_VSZ9E</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2618424666</pqid></control><display><type>article</type><title>Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity</title><source>MEDLINE</source><source>ACS Publications</source><creator>Feinstein, Jeremy ; Sivaraman, Ganesh ; Picel, Kurt ; Peters, Brian ; Vázquez-Mayagoitia, Álvaro ; Ramanathan, Arvind ; MacDonell, Margaret ; Foster, Ian ; Yan, Eugene</creator><creatorcontrib>Feinstein, Jeremy ; Sivaraman, Ganesh ; Picel, Kurt ; Peters, Brian ; Vázquez-Mayagoitia, Álvaro ; Ramanathan, Arvind ; MacDonell, Margaret ; Foster, Ian ; Yan, Eugene ; Argonne National Lab. (ANL), Argonne, IL (United States)</creatorcontrib><description>Perfluoroalkyl and polyfluoroalkyl substances (PFAS) pose a significant hazard because of their widespread industrial uses, environmental persistence, and bioaccumulation. A growing, increasingly diverse inventory of PFAS, including 8163 chemicals, has recently been updated by the U.S. Environmental Protection Agency. 
However, with the exception of a handful of well-studied examples, little is known about their human toxicity potential because of the substantial resources required for in vivo toxicity experiments. We tackle the problem of expensive in vivo experiments by evaluating multiple machine learning (ML) methods, including random forests, deep neural networks (DNN), graph convolutional networks, and Gaussian processes, for predicting acute toxicity (e.g., median lethal dose, or LD50) of PFAS compounds. To address the scarcity of toxicity information for PFAS, publicly available datasets of oral rat LD50 for all organic compounds are aggregated and used to develop state-of-the-art ML source models for transfer learning. A total of 519 fluorinated compounds containing two or more C-F bonds with known toxicity are used for knowledge transfer to ensembles of the best-performing source model, DNN, to generate the target models for the PFAS domain with access to uncertainty. This study predicts toxicity for PFAS with a defined chemical structure. 
To further inform prediction confidence, the transfer-learned model is embedded within a SelectiveNet architecture, where the model is allowed to identify regions of prediction with greater confidence and abstain from those with high uncertainty using a calibrated cutoff rate.</description><identifier>ISSN: 1549-9596</identifier><identifier>EISSN: 1549-960X</identifier><identifier>DOI: 10.1021/acs.jcim.1c01204</identifier><identifier>PMID: 34905348</identifier><language>eng</language><publisher>United States: American Chemical Society</publisher><subject>Animals ; Artificial neural networks ; Bioaccumulation ; Biocompatibility ; Chemical bonds ; Environmental protection ; Fluorocarbons - chemistry ; Fluorocarbons - toxicity ; Gaussian process ; In vivo methods and tests ; Industrial applications ; Knowledge management ; Layers ; Machine Learning ; Machine Learning and Deep Learning ; Molecules ; Neural networks ; Neural Networks, Computer ; Organic compounds ; Perfluoroalkyl & polyfluoroalkyl substances ; Rats ; Rodent models ; Toxicity ; Uncertainty</subject><ispartof>Journal of chemical information and modeling, 2021-12, Vol.61 (12), p.5793-5803</ispartof><rights>2021 UChicago Argonne, LLC. 
Published by American Chemical Society</rights><rights>Copyright American Chemical Society Dec 27, 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a433t-d540e1689405bd47fabc511af1e2382a94bb89f481802dcbca0f9be1797e9dee3</citedby><cites>FETCH-LOGICAL-a433t-d540e1689405bd47fabc511af1e2382a94bb89f481802dcbca0f9be1797e9dee3</cites><orcidid>0000-0002-1415-6300 ; 0000-0001-9056-9855 ; 0000-0002-7112-7397 ; 0000000190569855 ; 0000000214156300 ; 0000000271127397</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.jcim.1c01204$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.jcim.1c01204$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>230,314,780,784,885,2765,27076,27924,27925,56738,56788</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34905348$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/biblio/1835619$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Feinstein, Jeremy</creatorcontrib><creatorcontrib>Sivaraman, Ganesh</creatorcontrib><creatorcontrib>Picel, Kurt</creatorcontrib><creatorcontrib>Peters, Brian</creatorcontrib><creatorcontrib>Vázquez-Mayagoitia, Álvaro</creatorcontrib><creatorcontrib>Ramanathan, Arvind</creatorcontrib><creatorcontrib>MacDonell, Margaret</creatorcontrib><creatorcontrib>Foster, Ian</creatorcontrib><creatorcontrib>Yan, Eugene</creatorcontrib><creatorcontrib>Argonne National Lab. (ANL), Argonne, IL (United States)</creatorcontrib><title>Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity</title><title>Journal of chemical information and modeling</title><addtitle>J. Chem. Inf. 
Model</addtitle><description>Perfluoroalkyl and polyfluoroalkyl substances (PFAS) pose a significant hazard because of their widespread industrial uses, environmental persistence, and bioaccumulation. A growing, increasingly diverse inventory of PFAS, including 8163 chemicals, has recently been updated by the U.S. Environmental Protection Agency. However, with the exception of a handful of well-studied examples, little is known about their human toxicity potential because of the substantial resources required for in vivo toxicity experiments. We tackle the problem of expensive in vivo experiments by evaluating multiple machine learning (ML) methods, including random forests, deep neural networks (DNN), graph convolutional networks, and Gaussian processes, for predicting acute toxicity (e.g., median lethal dose, or LD50) of PFAS compounds. To address the scarcity of toxicity information for PFAS, publicly available datasets of oral rat LD50 for all organic compounds are aggregated and used to develop state-of-the-art ML source models for transfer learning. A total of 519 fluorinated compounds containing two or more C-F bonds with known toxicity are used for knowledge transfer to ensembles of the best-performing source model, DNN, to generate the target models for the PFAS domain with access to uncertainty. This study predicts toxicity for PFAS with a defined chemical structure. 
To further inform prediction confidence, the transfer-learned model is embedded within a SelectiveNet architecture, where the model is allowed to identify regions of prediction with greater confidence and abstain from those with high uncertainty using a calibrated cutoff rate.</description><subject>Animals</subject><subject>Artificial neural networks</subject><subject>Bioaccumulation</subject><subject>Biocompatibility</subject><subject>Chemical bonds</subject><subject>Environmental protection</subject><subject>Fluorocarbons - chemistry</subject><subject>Fluorocarbons - toxicity</subject><subject>Gaussian process</subject><subject>In vivo methods and tests</subject><subject>Industrial applications</subject><subject>Knowledge management</subject><subject>Layers</subject><subject>Machine Learning</subject><subject>Machine Learning and Deep Learning</subject><subject>Molecules</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Organic compounds</subject><subject>Perfluoroalkyl & polyfluoroalkyl substances</subject><subject>Rats</subject><subject>Rodent models</subject><subject>Toxicity</subject><subject>Uncertainty</subject><issn>1549-9596</issn><issn>1549-960X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kcFr2zAUh8VYWbtu952G2S471JmeJCvWsXRbWwis0BR2E7L8tDlzpFSyof7vqzRJKYOdJMT3-z2ePkI-AJ0BZfDV2DRb2W49A0uBUfGKnEAlVKkk_fX6cK-UPCZvU1pRyrmS7A055kLRiov6hLR33mIcTOeHqbz2LsQ1tsU3xE2xjMYnh7FYoIm-87-L4IobjK4fQwym_zv1hfFtcRP66eXb7dikweTaYhkeOtsN0zty5Eyf8P3-PCV3P74vL67Kxc_L64vzRWkE50PZVoIiyFoJWjWtmDvT2ArAOEDGa2aUaJpaOVFDTVlrG2uoUw3CXM1RtYj8lHza9YY0dDrl0Wj_2OA92kFDzSsJKkNfdtAmhvsR06DXXbLY98ZjGJNmEqgAwVmV0c__oKswRp9X2FK1YEJKmSm6o2wMKUV0ehO7tYmTBqq3mnTWpLea9F5TjnzcF49N_u_nwMFLBs52wFP0MPS_fY_VSZ9E</recordid><startdate>20211227</startdate><enddate>20211227</enddate><creator>Feinstein, Jeremy</creator><creator>Sivaraman, 
Ganesh</creator><creator>Picel, Kurt</creator><creator>Peters, Brian</creator><creator>Vázquez-Mayagoitia, Álvaro</creator><creator>Ramanathan, Arvind</creator><creator>MacDonell, Margaret</creator><creator>Foster, Ian</creator><creator>Yan, Eugene</creator><general>American Chemical Society</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><scope>OTOTI</scope><orcidid>https://orcid.org/0000-0002-1415-6300</orcidid><orcidid>https://orcid.org/0000-0001-9056-9855</orcidid><orcidid>https://orcid.org/0000-0002-7112-7397</orcidid><orcidid>https://orcid.org/0000000190569855</orcidid><orcidid>https://orcid.org/0000000214156300</orcidid><orcidid>https://orcid.org/0000000271127397</orcidid></search><sort><creationdate>20211227</creationdate><title>Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity</title><author>Feinstein, Jeremy ; Sivaraman, Ganesh ; Picel, Kurt ; Peters, Brian ; Vázquez-Mayagoitia, Álvaro ; Ramanathan, Arvind ; MacDonell, Margaret ; Foster, Ian ; Yan, Eugene</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a433t-d540e1689405bd47fabc511af1e2382a94bb89f481802dcbca0f9be1797e9dee3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Animals</topic><topic>Artificial neural networks</topic><topic>Bioaccumulation</topic><topic>Biocompatibility</topic><topic>Chemical bonds</topic><topic>Environmental protection</topic><topic>Fluorocarbons - chemistry</topic><topic>Fluorocarbons - toxicity</topic><topic>Gaussian process</topic><topic>In vivo methods and tests</topic><topic>Industrial 
applications</topic><topic>Knowledge management</topic><topic>Layers</topic><topic>Machine Learning</topic><topic>Machine Learning and Deep Learning</topic><topic>Molecules</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Organic compounds</topic><topic>Perfluoroalkyl & polyfluoroalkyl substances</topic><topic>Rats</topic><topic>Rodent models</topic><topic>Toxicity</topic><topic>Uncertainty</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Feinstein, Jeremy</creatorcontrib><creatorcontrib>Sivaraman, Ganesh</creatorcontrib><creatorcontrib>Picel, Kurt</creatorcontrib><creatorcontrib>Peters, Brian</creatorcontrib><creatorcontrib>Vázquez-Mayagoitia, Álvaro</creatorcontrib><creatorcontrib>Ramanathan, Arvind</creatorcontrib><creatorcontrib>MacDonell, Margaret</creatorcontrib><creatorcontrib>Foster, Ian</creatorcontrib><creatorcontrib>Yan, Eugene</creatorcontrib><creatorcontrib>Argonne National Lab. (ANL), Argonne, IL (United States)</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><collection>OSTI.GOV</collection><jtitle>Journal of chemical information 
and modeling</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Feinstein, Jeremy</au><au>Sivaraman, Ganesh</au><au>Picel, Kurt</au><au>Peters, Brian</au><au>Vázquez-Mayagoitia, Álvaro</au><au>Ramanathan, Arvind</au><au>MacDonell, Margaret</au><au>Foster, Ian</au><au>Yan, Eugene</au><aucorp>Argonne National Lab. (ANL), Argonne, IL (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity</atitle><jtitle>Journal of chemical information and modeling</jtitle><addtitle>J. Chem. Inf. Model</addtitle><date>2021-12-27</date><risdate>2021</risdate><volume>61</volume><issue>12</issue><spage>5793</spage><epage>5803</epage><pages>5793-5803</pages><issn>1549-9596</issn><eissn>1549-960X</eissn><abstract>Perfluoroalkyl and polyfluoroalkyl substances (PFAS) pose a significant hazard because of their widespread industrial uses, environmental persistence, and bioaccumulation. A growing, increasingly diverse inventory of PFAS, including 8163 chemicals, has recently been updated by the U.S. Environmental Protection Agency. However, with the exception of a handful of well-studied examples, little is known about their human toxicity potential because of the substantial resources required for in vivo toxicity experiments. We tackle the problem of expensive in vivo experiments by evaluating multiple machine learning (ML) methods, including random forests, deep neural networks (DNN), graph convolutional networks, and Gaussian processes, for predicting acute toxicity (e.g., median lethal dose, or LD50) of PFAS compounds. To address the scarcity of toxicity information for PFAS, publicly available datasets of oral rat LD50 for all organic compounds are aggregated and used to develop state-of-the-art ML source models for transfer learning. 
A total of 519 fluorinated compounds containing two or more C-F bonds with known toxicity are used for knowledge transfer to ensembles of the best-performing source model, DNN, to generate the target models for the PFAS domain with access to uncertainty. This study predicts toxicity for PFAS with a defined chemical structure. To further inform prediction confidence, the transfer-learned model is embedded within a SelectiveNet architecture, where the model is allowed to identify regions of prediction with greater confidence and abstain from those with high uncertainty using a calibrated cutoff rate.</abstract><cop>United States</cop><pub>American Chemical Society</pub><pmid>34905348</pmid><doi>10.1021/acs.jcim.1c01204</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0002-1415-6300</orcidid><orcidid>https://orcid.org/0000-0001-9056-9855</orcidid><orcidid>https://orcid.org/0000-0002-7112-7397</orcidid><orcidid>https://orcid.org/0000000190569855</orcidid><orcidid>https://orcid.org/0000000214156300</orcidid><orcidid>https://orcid.org/0000000271127397</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1549-9596 |
ispartof | Journal of chemical information and modeling, 2021-12, Vol.61 (12), p.5793-5803 |
issn | 1549-9596 ; 1549-960X |
language | eng |
recordid | cdi_osti_scitechconnect_1835619 |
source | MEDLINE; ACS Publications |
subjects | Animals ; Artificial neural networks ; Bioaccumulation ; Biocompatibility ; Chemical bonds ; Environmental protection ; Fluorocarbons - chemistry ; Fluorocarbons - toxicity ; Gaussian process ; In vivo methods and tests ; Industrial applications ; Knowledge management ; Layers ; Machine Learning ; Machine Learning and Deep Learning ; Molecules ; Neural networks ; Neural Networks, Computer ; Organic compounds ; Perfluoroalkyl & polyfluoroalkyl substances ; Rats ; Rodent models ; Toxicity ; Uncertainty |
title | Uncertainty-Informed Deep Transfer Learning of Perfluoroalkyl and Polyfluoroalkyl Substance Toxicity |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T05%3A55%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_osti_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Uncertainty-Informed%20Deep%20Transfer%20Learning%20of%20Perfluoroalkyl%20and%20Polyfluoroalkyl%20Substance%20Toxicity&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Feinstein,%20Jeremy&rft.aucorp=Argonne%20National%20Lab.%20(ANL),%20Argonne,%20IL%20(United%20States)&rft.date=2021-12-27&rft.volume=61&rft.issue=12&rft.spage=5793&rft.epage=5803&rft.pages=5793-5803&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/acs.jcim.1c01204&rft_dat=%3Cproquest_osti_%3E2610414325%3C/proquest_osti_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2618424666&rft_id=info:pmid/34905348&rfr_iscdi=true |
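
The description in this record mentions ensembles of transfer-learned DNNs "with access to uncertainty" and a model that abstains from high-uncertainty predictions using a calibrated cutoff. The sketch below illustrates that general idea only. It is not the authors' code and it is not SelectiveNet, which learns a selection head jointly with the predictor; here the spread across ensemble members is used as a simpler stand-in for uncertainty. All function names and all numbers are hypothetical.

```python
import math
import statistics


def ensemble_predict(member_preds):
    """Return per-compound (means, sample standard deviations) across models.

    member_preds: one list of predictions per ensemble member, all equal length.
    """
    per_compound = list(zip(*member_preds))  # transpose to (n_compounds, n_models)
    means = [statistics.fmean(p) for p in per_compound]
    stds = [statistics.stdev(p) for p in per_compound]
    return means, stds


def calibrate_cutoff(calibration_stds, target_coverage):
    """Pick the spread cutoff that keeps about target_coverage of the samples."""
    ordered = sorted(calibration_stds)
    n_keep = max(1, math.floor(target_coverage * len(ordered)))
    return ordered[n_keep - 1]


def selective_predict(member_preds, cutoff):
    """Return the ensemble mean, or None (abstain) where the spread exceeds cutoff."""
    means, stds = ensemble_predict(member_preds)
    return [m if s <= cutoff else None for m, s in zip(means, stds)]


# Hypothetical toxicity predictions from three ensemble members for four
# compounds (made-up values, not data from the article).
member_preds = [
    [2.0, 3.0, 1.0, 4.0],
    [2.1, 3.5, 1.0, 2.0],
    [1.9, 2.5, 1.0, 6.0],
]

means, stds = ensemble_predict(member_preds)
# For simplicity the cutoff is calibrated on the same compounds; in practice
# a held-out calibration set would be used.
cutoff = calibrate_cutoff(stds, target_coverage=0.75)
results = selective_predict(member_preds, cutoff)
print(results)  # the fourth compound, where the members disagree most, is None
```

Choosing the cutoff from a target coverage rather than fixing it by hand keeps the trade-off explicit: lowering `target_coverage` makes the sketch abstain more often and report only the predictions on which the ensemble members agree most closely.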