High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods

The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwi...
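
The comparison described in the abstract (fitting regression models on several training-test partitions and scoring them by external predictive R²) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not give the paper's exact definition of the predictive R², so a common form is used (one minus the prediction error sum of squares divided by the test-set sum of squares around the training mean); a single-descriptor ordinary least-squares fit stands in for the MLR, SLR and SVR models; the data are synthetic; and the test-set size of 10 and the function names `r2_pred`, `partition`, `fit_line` and `demo` are hypothetical.

```python
import random
from statistics import mean


def r2_pred(y_test, y_hat, y_train_mean):
    """External predictive R^2 (assumed definition, not taken from the paper):
    1 - sum((y - y_hat)^2) / sum((y - mean of training y)^2) over the test set."""
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_hat))
    ss = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - press / ss


def partition(n, n_test, seed):
    """Randomly split indices 0..n-1 into sorted (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def fit_line(x, y):
    """Ordinary least squares with one descriptor: returns (a, b) for y = a + b*x.
    This is a simple stand-in for the MLR/SLR/SVR models named in the abstract."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def demo(n=50, n_test=10, seeds=(1, 2, 3)):
    """Score the same model on three different training-test partitions.
    The 50 'compounds' are synthetic; n_test=10 is an arbitrary choice."""
    rng = random.Random(0)
    x = [rng.uniform(0.0, 5.0) for _ in range(n)]            # made-up descriptor
    y = [0.5 + 0.8 * xi + rng.gauss(0.0, 0.1) for xi in x]   # made-up toxicity
    scores = []
    for seed in seeds:
        train, test = partition(n, n_test, seed)
        y_train = [y[i] for i in train]
        a, b = fit_line([x[i] for i in train], y_train)
        y_hat = [a + b * x[i] for i in test]
        scores.append(r2_pred([y[i] for i in test], y_hat, mean(y_train)))
    return scores


if __name__ == "__main__":
    for k, score in enumerate(demo(), start=1):
        print(f"partition {k}: R2_pred = {score:.3f}")
```

Reporting one score per partition, rather than a single number, is what makes it possible to see whether a model's external predictive ability is stable across splits of a small dataset.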

Bibliographic details
Published in: RSC advances, 2016-01, Vol.6 (18), p.16847-16855
Main authors: Zhou, Wei; Fan, Yanjun; Cai, Xunhui; Xiang, Yan; Jiang, Peng; Dai, Zhijun; Chen, Yuan; Tan, Siqiao; Yuan, Zheming
Format: Article
Language: eng
Online access: Full text
container_end_page 16855
container_issue 18
container_start_page 16847
container_title RSC advances
container_volume 6
creator Zhou, Wei
Fan, Yanjun
Cai, Xunhui
Xiang, Yan
Jiang, Peng
Dai, Zhijun
Chen, Yuan
Tan, Siqiao
Yuan, Zheming
description The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions; the optimal SVR models had the highest external prediction ability under all three training-test partitions, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software and used them to construct the MLR, SLR and SVR models. Our results showed that our best SVR model ($R^2_{\mathrm{pred}} = 0.972$) improved significantly, by 16.55%, on the test set, and that an appropriate partition gave better stability. The different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor in the model through an interpretability analysis. Our work provides a valuable exploration of different combinations of data partition, descriptor selection and modelling method, and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.
doi_str_mv 10.1039/c6ra21076g
format Article
fulltext fulltext
identifier ISSN: 2046-2069
ispartof RSC advances, 2016-01, Vol.6 (18), p.16847-16855
issn 2046-2069
2046-2069
language eng
recordid cdi_crossref_primary_10_1039_C6RA21076G
source Royal Society Of Chemistry Journals
subjects Mathematical models
Partitions
Phenols
Regression
Regression analysis
Toxicity
title High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T21%3A05%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High-accuracy%20QSAR%20models%20of%20narcosis%20toxicities%20of%20phenols%20based%20on%20various%20data%20partition,%20descriptor%20selection%20and%20modelling%20methods&rft.jtitle=RSC%20advances&rft.au=Zhou,%20Wei&rft.date=2016-01-01&rft.volume=6&rft.issue=18&rft.spage=16847&rft.epage=16855&rft.pages=16847-16855&rft.issn=2046-2069&rft.eissn=2046-2069&rft_id=info:doi/10.1039/c6ra21076g&rft_dat=%3Cproquest_cross%3E1864544060%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1864544060&rft_id=info:pmid/&rfr_iscdi=true