Assessing Graph‐based Deep Learning Models for Predicting Flash Point

Flash points of organic molecules play an important role in preventing flammability hazards and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds many researchers have used quantitative structure‐property rela...

Bibliographic details
Published in: Molecular informatics 2020-06, Vol.39 (6), p.e1900101-n/a
Main authors: Sun, Xiaoyu; Krakauer, Nathaniel J.; Politowicz, Alexander; Chen, Wei‐Ting; Li, Qiying; Li, Zuoyi; Shao, Xianjia; Sunaryo, Alfred; Shen, Mingren; Wang, James; Morgan, Dane
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page n/a
container_issue 6
container_start_page e1900101
container_title Molecular informatics
container_volume 39
creator Sun, Xiaoyu
Krakauer, Nathaniel J.
Politowicz, Alexander
Chen, Wei‐Ting
Li, Qiying
Li, Zuoyi
Shao, Xianjia
Sunaryo, Alfred
Shen, Mingren
Wang, James
Morgan, Dane
description Flash points of organic molecules play an important role in preventing flammability hazards, and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds, many researchers have used quantitative structure‐property relationship (QSPR) analysis to predict flash points. In recent years, graph‐based deep learning (GBDL) has emerged as a powerful alternative to traditional QSPR. In this paper, GBDL models were applied to flash point prediction for the first time. We assessed the performance of two GBDL models, message‐passing neural network (MPNN) and graph convolutional neural network (GCNN), by comparing them against 12 previous QSPR studies that used more traditional methods. Our results show that MPNN outperforms GCNN and yields slightly worse but comparable performance relative to previous QSPR studies. The average R² and mean absolute error (MAE) scores of MPNN are, respectively, 2.3 % lower and 2.0 K higher than those of previous comparable studies. To further explore GBDL models, we collected the largest flash point dataset to date, which contains 10575 unique molecules. The optimized MPNN gives a test data R² of 0.803 and MAE of 17.8 K on the complete dataset. We also extracted 5 datasets from our integrated dataset based on molecular types (acids, organometallics, organogermaniums, organosilicons, and organotins) and explored the quality of the model in these classes.
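The two scores quoted in the description can be illustrated with a minimal sketch. It assumes that R² means the coefficient of determination (the record does not say which definition the authors used), and the flash point values below are invented for illustration only; they are not taken from the paper's dataset. The function names are hypothetical helpers, not the authors' code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted flash points in kelvin (illustrative only).
measured = [285.0, 310.0, 335.0, 360.0]
predicted = [290.0, 305.0, 340.0, 350.0]

mae = mean_absolute_error(measured, predicted)
r2 = r2_score(measured, predicted)
print(f"MAE = {mae:.2f} K, R^2 = {r2:.3f}")
```

Because MAE is expressed in the same unit as the target, a reported MAE of 17.8 K reads directly as a typical prediction error in kelvin, whereas R² is dimensionless.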
doi_str_mv 10.1002/minf.201900101
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2359393371</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2359393371</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4101-d31033d292145a520b9cdc8d3356b4a192e67df4a7cd0ab344cb3fc4465384db3</originalsourceid><addsrcrecordid>eNqFkD1PwzAQhi0EolXpyogisbCk-CtxPFaFlkotdIDZcmyHpsoXdiPUjZ_Ab-SX4KilSCzccqe7517dvQBcIjhCEOLbMq-yEYaIQ4ggOgF9lMRJiFiETo81JT0wdG4DfRAcs4Sfgx7BkDFMoj6YjZ0zzuXVazCzsll_fXym0hkd3BnTBAsjbdXNlrU2hQuy2gYra3Sutl13Wki3DlZ1Xm0vwFkmC2eGhzwAL9P758lDuHiazSfjRaioPzHUBEFCNOYY0UhGGKZcaZVoQqI4pRJxbGKmMyqZ0lCmhFKVkkxRGkckoTolA3Cz121s_dYatxVl7pQpClmZunXCP8UJJ4Qhj17_QTd1ayt_ncAUJozHMceeGu0pZWvnrMlEY_NS2p1AUHQui85lcXTZL1wdZNu0NPqI_3jqAb4H3vPC7P6RE8v54_RX_BtCB4eS</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2408796692</pqid></control><display><type>article</type><title>Assessing Graph‐based Deep Learning Models for Predicting Flash Point</title><source>Wiley Online Library Journals Frontfile Complete</source><creator>Sun, Xiaoyu ; Krakauer, Nathaniel J. ; Politowicz, Alexander ; Chen, Wei‐Ting ; Li, Qiying ; Li, Zuoyi ; Shao, Xianjia ; Sunaryo, Alfred ; Shen, Mingren ; Wang, James ; Morgan, Dane</creator><creatorcontrib>Sun, Xiaoyu ; Krakauer, Nathaniel J. ; Politowicz, Alexander ; Chen, Wei‐Ting ; Li, Qiying ; Li, Zuoyi ; Shao, Xianjia ; Sunaryo, Alfred ; Shen, Mingren ; Wang, James ; Morgan, Dane</creatorcontrib><description>Flash points of organic molecules play an important role in preventing flammability hazards and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds many researchers have used quantitative structure‐property relationship (QSPR) analysis to effectively predict flash points. 
In recent years graph‐based deep learning (GBDL) has emerged as a powerful alternative method to traditional QSPR. In this paper, GBDL models were implemented in predicting flash point for the first time. We assessed the performance of two GBDL models, message‐passing neural network (MPNN) and graph convolutional neural network (GCNN), by comparing against 12 previous QSPR studies using more traditional methods. Our result shows that MPNN both outperforms GCNN and yields slightly worse but comparable performance with previous QSPR studies. The average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3 % lower and 2.0 K higher than previous comparable studies. To further explore GBDL models, we collected the largest flash point dataset to date, which contains 10575 unique molecules. The optimized MPNN gives a test data R2 of 0.803 and MAE of 17.8 K on the complete dataset. We also extracted 5 datasets from our integrated dataset based on molecular types (acids, organometallics, organogermaniums, organosilicons, and organotins) and explore the quality of the model in these classes.</description><identifier>ISSN: 1868-1743</identifier><identifier>EISSN: 1868-1751</identifier><identifier>DOI: 10.1002/minf.201900101</identifier><identifier>PMID: 32077235</identifier><language>eng</language><publisher>Germany: Wiley Subscription Services, Inc</publisher><subject>Artificial neural networks ; Datasets ; Deep learning ; Domain of applicability ; Flammability ; Flash point ; Machine learning ; Message passing ; Neural network ; Neural networks ; Organic chemistry ; Organometallic compounds ; Quantitative structure-property relationship ; Robust model prediction</subject><ispartof>Molecular informatics, 2020-06, Vol.39 (6), p.e1900101-n/a</ispartof><rights>2020 Wiley‐VCH Verlag GmbH &amp; Co. KGaA, Weinheim</rights><rights>2020 Wiley-VCH Verlag GmbH &amp; Co. 
KGaA, Weinheim.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4101-d31033d292145a520b9cdc8d3356b4a192e67df4a7cd0ab344cb3fc4465384db3</citedby><cites>FETCH-LOGICAL-c4101-d31033d292145a520b9cdc8d3356b4a192e67df4a7cd0ab344cb3fc4465384db3</cites><orcidid>0000-0002-4911-0046</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fminf.201900101$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fminf.201900101$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1416,27922,27923,45572,45573</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32077235$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Sun, Xiaoyu</creatorcontrib><creatorcontrib>Krakauer, Nathaniel J.</creatorcontrib><creatorcontrib>Politowicz, Alexander</creatorcontrib><creatorcontrib>Chen, Wei‐Ting</creatorcontrib><creatorcontrib>Li, Qiying</creatorcontrib><creatorcontrib>Li, Zuoyi</creatorcontrib><creatorcontrib>Shao, Xianjia</creatorcontrib><creatorcontrib>Sunaryo, Alfred</creatorcontrib><creatorcontrib>Shen, Mingren</creatorcontrib><creatorcontrib>Wang, James</creatorcontrib><creatorcontrib>Morgan, Dane</creatorcontrib><title>Assessing Graph‐based Deep Learning Models for Predicting Flash Point</title><title>Molecular informatics</title><addtitle>Mol Inform</addtitle><description>Flash points of organic molecules play an important role in preventing flammability hazards and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds many researchers have used quantitative structure‐property relationship (QSPR) analysis to effectively predict flash points. 
In recent years graph‐based deep learning (GBDL) has emerged as a powerful alternative method to traditional QSPR. In this paper, GBDL models were implemented in predicting flash point for the first time. We assessed the performance of two GBDL models, message‐passing neural network (MPNN) and graph convolutional neural network (GCNN), by comparing against 12 previous QSPR studies using more traditional methods. Our result shows that MPNN both outperforms GCNN and yields slightly worse but comparable performance with previous QSPR studies. The average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3 % lower and 2.0 K higher than previous comparable studies. To further explore GBDL models, we collected the largest flash point dataset to date, which contains 10575 unique molecules. The optimized MPNN gives a test data R2 of 0.803 and MAE of 17.8 K on the complete dataset. We also extracted 5 datasets from our integrated dataset based on molecular types (acids, organometallics, organogermaniums, organosilicons, and organotins) and explore the quality of the model in these classes.</description><subject>Artificial neural networks</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Domain of applicability</subject><subject>Flammability</subject><subject>Flash point</subject><subject>Machine learning</subject><subject>Message passing</subject><subject>Neural network</subject><subject>Neural networks</subject><subject>Organic chemistry</subject><subject>Organometallic compounds</subject><subject>Quantitative structure-property relationship</subject><subject>Robust model 
prediction</subject><issn>1868-1743</issn><issn>1868-1751</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqFkD1PwzAQhi0EolXpyogisbCk-CtxPFaFlkotdIDZcmyHpsoXdiPUjZ_Ab-SX4KilSCzccqe7517dvQBcIjhCEOLbMq-yEYaIQ4ggOgF9lMRJiFiETo81JT0wdG4DfRAcs4Sfgx7BkDFMoj6YjZ0zzuXVazCzsll_fXym0hkd3BnTBAsjbdXNlrU2hQuy2gYra3Sutl13Wki3DlZ1Xm0vwFkmC2eGhzwAL9P758lDuHiazSfjRaioPzHUBEFCNOYY0UhGGKZcaZVoQqI4pRJxbGKmMyqZ0lCmhFKVkkxRGkckoTolA3Cz121s_dYatxVl7pQpClmZunXCP8UJJ4Qhj17_QTd1ayt_ncAUJozHMceeGu0pZWvnrMlEY_NS2p1AUHQui85lcXTZL1wdZNu0NPqI_3jqAb4H3vPC7P6RE8v54_RX_BtCB4eS</recordid><startdate>202006</startdate><enddate>202006</enddate><creator>Sun, Xiaoyu</creator><creator>Krakauer, Nathaniel J.</creator><creator>Politowicz, Alexander</creator><creator>Chen, Wei‐Ting</creator><creator>Li, Qiying</creator><creator>Li, Zuoyi</creator><creator>Shao, Xianjia</creator><creator>Sunaryo, Alfred</creator><creator>Shen, Mingren</creator><creator>Wang, James</creator><creator>Morgan, Dane</creator><general>Wiley Subscription Services, Inc</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7TM</scope><scope>7U7</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>P64</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-4911-0046</orcidid></search><sort><creationdate>202006</creationdate><title>Assessing Graph‐based Deep Learning Models for Predicting Flash Point</title><author>Sun, Xiaoyu ; Krakauer, Nathaniel J. 
; Politowicz, Alexander ; Chen, Wei‐Ting ; Li, Qiying ; Li, Zuoyi ; Shao, Xianjia ; Sunaryo, Alfred ; Shen, Mingren ; Wang, James ; Morgan, Dane</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4101-d31033d292145a520b9cdc8d3356b4a192e67df4a7cd0ab344cb3fc4465384db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Artificial neural networks</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Domain of applicability</topic><topic>Flammability</topic><topic>Flash point</topic><topic>Machine learning</topic><topic>Message passing</topic><topic>Neural network</topic><topic>Neural networks</topic><topic>Organic chemistry</topic><topic>Organometallic compounds</topic><topic>Quantitative structure-property relationship</topic><topic>Robust model prediction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sun, Xiaoyu</creatorcontrib><creatorcontrib>Krakauer, Nathaniel J.</creatorcontrib><creatorcontrib>Politowicz, Alexander</creatorcontrib><creatorcontrib>Chen, Wei‐Ting</creatorcontrib><creatorcontrib>Li, Qiying</creatorcontrib><creatorcontrib>Li, Zuoyi</creatorcontrib><creatorcontrib>Shao, Xianjia</creatorcontrib><creatorcontrib>Sunaryo, Alfred</creatorcontrib><creatorcontrib>Shen, Mingren</creatorcontrib><creatorcontrib>Wang, James</creatorcontrib><creatorcontrib>Morgan, Dane</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete 
(Alumni)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Molecular informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sun, Xiaoyu</au><au>Krakauer, Nathaniel J.</au><au>Politowicz, Alexander</au><au>Chen, Wei‐Ting</au><au>Li, Qiying</au><au>Li, Zuoyi</au><au>Shao, Xianjia</au><au>Sunaryo, Alfred</au><au>Shen, Mingren</au><au>Wang, James</au><au>Morgan, Dane</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Assessing Graph‐based Deep Learning Models for Predicting Flash Point</atitle><jtitle>Molecular informatics</jtitle><addtitle>Mol Inform</addtitle><date>2020-06</date><risdate>2020</risdate><volume>39</volume><issue>6</issue><spage>e1900101</spage><epage>n/a</epage><pages>e1900101-n/a</pages><issn>1868-1743</issn><eissn>1868-1751</eissn><abstract>Flash points of organic molecules play an important role in preventing flammability hazards and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds many researchers have used quantitative structure‐property relationship (QSPR) analysis to effectively predict flash points. In recent years graph‐based deep learning (GBDL) has emerged as a powerful alternative method to traditional QSPR. In this paper, GBDL models were implemented in predicting flash point for the first time. We assessed the performance of two GBDL models, message‐passing neural network (MPNN) and graph convolutional neural network (GCNN), by comparing against 12 previous QSPR studies using more traditional methods. Our result shows that MPNN both outperforms GCNN and yields slightly worse but comparable performance with previous QSPR studies. 
The average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3 % lower and 2.0 K higher than previous comparable studies. To further explore GBDL models, we collected the largest flash point dataset to date, which contains 10575 unique molecules. The optimized MPNN gives a test data R2 of 0.803 and MAE of 17.8 K on the complete dataset. We also extracted 5 datasets from our integrated dataset based on molecular types (acids, organometallics, organogermaniums, organosilicons, and organotins) and explore the quality of the model in these classes.</abstract><cop>Germany</cop><pub>Wiley Subscription Services, Inc</pub><pmid>32077235</pmid><doi>10.1002/minf.201900101</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-4911-0046</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1868-1743
ispartof Molecular informatics, 2020-06, Vol.39 (6), p.e1900101-n/a
issn 1868-1743
1868-1751
language eng
recordid cdi_proquest_miscellaneous_2359393371
source Wiley Online Library Journals Frontfile Complete
subjects Artificial neural networks
Datasets
Deep learning
Domain of applicability
Flammability
Flash point
Machine learning
Message passing
Neural network
Neural networks
Organic chemistry
Organometallic compounds
Quantitative structure-property relationship
Robust model prediction
title Assessing Graph‐based Deep Learning Models for Predicting Flash Point
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T02%3A08%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Graph%E2%80%90based%20Deep%20Learning%20Models%20for%20Predicting%20Flash%20Point&rft.jtitle=Molecular%20informatics&rft.au=Sun,%20Xiaoyu&rft.date=2020-06&rft.volume=39&rft.issue=6&rft.spage=e1900101&rft.epage=n/a&rft.pages=e1900101-n/a&rft.issn=1868-1743&rft.eissn=1868-1751&rft_id=info:doi/10.1002/minf.201900101&rft_dat=%3Cproquest_cross%3E2359393371%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2408796692&rft_id=info:pmid/32077235&rfr_iscdi=true