High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods
The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwi...
Published in: | RSC advances 2016-01, Vol.6 (18), p.16847-16855 |
---|---|
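Main authors: | Zhou, Wei ; Fan, Yanjun ; Cai, Xunhui ; Xiang, Yan ; Jiang, Peng ; Dai, Zhijun ; Chen, Yuan ; Tan, Siqiao ; Yuan, Zheming |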
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 16855 |
---|---|
container_issue | 18 |
container_start_page | 16847 |
container_title | RSC advances |
container_volume | 6 |
creator | Zhou, Wei ; Fan, Yanjun ; Cai, Xunhui ; Xiang, Yan ; Jiang, Peng ; Dai, Zhijun ; Chen, Yuan ; Tan, Siqiao ; Yuan, Zheming |
description | The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions, and the optimal SVR models with all three training-test partitions had the highest external prediction ability, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software and used them to construct the MLR, SLR and SVR models. Our results showed that our best SVR model ($R^{2}_{\mathrm{pred}} = 0.972$) significantly increased 16.55% on the test set, and the appropriate partition presented the better stability. The different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor of the model according to the interpretability analysis. Our work provided a valuable exploration of different combinations among data partition, descriptor selection and model and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset. |
doi_str_mv | 10.1039/c6ra21076g |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1039_C6RA21076G</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1864544060</sourcerecordid><originalsourceid>FETCH-LOGICAL-c312t-e2dd241860782029334eb99c88b2a3cb9cd076cfafa8435a7aa79ecdcd61f25c3</originalsourceid><addsrcrecordid>eNp9kU1Lw0AQhoMoWGov3oX1JmJ0d5NskmMp2goFseo5TGY37Uqajbup2F_g33bbiHpyLjPMPMzHO0Fwyug1o1F-g8ICZzQVy4NgwGksQk5FfvgnPg5Gzr1SbyJhXLBB8DnTy1UIiBsLuCWPT-MFWRupakdMRRqwaJx2pDMfGnWn1T7drlRjPFGCU5KYhryD1WbjiIQOSAu286hprohUDq1uO2OJU7XCXZZAI_sRtW6WZK26lZHuJDiqoHZq9O2Hwcvd7fNkFs4fpveT8TzEiPEuVFxKHrNM0DTjlOdRFKsyzzHLSg4RljlKfz9WUEEWRwmkAGmuUKIUrOIJRsPgou_bWvO2Ua4r1tqh3wUa5S8ofOs4iWMqqEcvexStcc6qqmitXoPdFowWO8GLiViM94JPPXzWw9bhD_f7EF8__69etLKKvgAuroso</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1864544060</pqid></control><display><type>article</type><title>High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods</title><source>Royal Society Of Chemistry Journals</source><creator>Zhou, Wei ; Fan, Yanjun ; Cai, Xunhui ; Xiang, Yan ; Jiang, Peng ; Dai, Zhijun ; Chen, Yuan ; Tan, Siqiao ; Yuan, Zheming</creator><creatorcontrib>Zhou, Wei ; Fan, Yanjun ; Cai, Xunhui ; Xiang, Yan ; Jiang, Peng ; Dai, Zhijun ; Chen, Yuan ; Tan, Siqiao ; Yuan, Zheming</creatorcontrib><description>The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. 
We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions, and the optimal SVR models with all three training-test partitions had the highest external prediction ability, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software and used them to construct the MLR, SLR and SVR models. Our results showed that our best SVR model (
$R^{2}_{\mathrm{pred}} = 0.972$) significantly increased 16.55% on the test set, and the appropriate partition presented the better stability. The different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor of the model according to the interpretability analysis. Our work provided a valuable exploration of different combinations among data partition, descriptor selection and model and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.
The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests.</description><identifier>ISSN: 2046-2069</identifier><identifier>EISSN: 2046-2069</identifier><identifier>DOI: 10.1039/c6ra21076g</identifier><language>eng</language><subject>Mathematical models ; Partitions ; Phenols ; Regression ; Regression analysis ; Toxicity</subject><ispartof>RSC advances, 2016-01, Vol.6 (18), p.16847-16855</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c312t-e2dd241860782029334eb99c88b2a3cb9cd076cfafa8435a7aa79ecdcd61f25c3</citedby><cites>FETCH-LOGICAL-c312t-e2dd241860782029334eb99c88b2a3cb9cd076cfafa8435a7aa79ecdcd61f25c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,27929,27930</link.rule.ids></links><search><creatorcontrib>Zhou, Wei</creatorcontrib><creatorcontrib>Fan, Yanjun</creatorcontrib><creatorcontrib>Cai, Xunhui</creatorcontrib><creatorcontrib>Xiang, Yan</creatorcontrib><creatorcontrib>Jiang, Peng</creatorcontrib><creatorcontrib>Dai, Zhijun</creatorcontrib><creatorcontrib>Chen, Yuan</creatorcontrib><creatorcontrib>Tan, Siqiao</creatorcontrib><creatorcontrib>Yuan, Zheming</creatorcontrib><title>High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods</title><title>RSC advances</title><description>The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. 
We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions, and the optimal SVR models with all three training-test partitions had the highest external prediction ability, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software and used them to construct the MLR, SLR and SVR models. Our results showed that our best SVR model (
$R^{2}_{\mathrm{pred}} = 0.972$) significantly increased 16.55% on the test set, and the appropriate partition presented the better stability. The different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor of the model according to the interpretability analysis. Our work provided a valuable exploration of different combinations among data partition, descriptor selection and model and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.
The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests.</description><subject>Mathematical models</subject><subject>Partitions</subject><subject>Phenols</subject><subject>Regression</subject><subject>Regression analysis</subject><subject>Toxicity</subject><issn>2046-2069</issn><issn>2046-2069</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNp9kU1Lw0AQhoMoWGov3oX1JmJ0d5NskmMp2goFseo5TGY37Uqajbup2F_g33bbiHpyLjPMPMzHO0Fwyug1o1F-g8ICZzQVy4NgwGksQk5FfvgnPg5Gzr1SbyJhXLBB8DnTy1UIiBsLuCWPT-MFWRupakdMRRqwaJx2pDMfGnWn1T7drlRjPFGCU5KYhryD1WbjiIQOSAu286hprohUDq1uO2OJU7XCXZZAI_sRtW6WZK26lZHuJDiqoHZq9O2Hwcvd7fNkFs4fpveT8TzEiPEuVFxKHrNM0DTjlOdRFKsyzzHLSg4RljlKfz9WUEEWRwmkAGmuUKIUrOIJRsPgou_bWvO2Ua4r1tqh3wUa5S8ofOs4iWMqqEcvexStcc6qqmitXoPdFowWO8GLiViM94JPPXzWw9bhD_f7EF8__69etLKKvgAuroso</recordid><startdate>20160101</startdate><enddate>20160101</enddate><creator>Zhou, Wei</creator><creator>Fan, Yanjun</creator><creator>Cai, Xunhui</creator><creator>Xiang, Yan</creator><creator>Jiang, Peng</creator><creator>Dai, Zhijun</creator><creator>Chen, Yuan</creator><creator>Tan, Siqiao</creator><creator>Yuan, Zheming</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope></search><sort><creationdate>20160101</creationdate><title>High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods</title><author>Zhou, Wei ; Fan, Yanjun ; Cai, Xunhui ; Xiang, Yan ; Jiang, Peng ; Dai, Zhijun ; Chen, Yuan ; Tan, Siqiao ; Yuan, 
Zheming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c312t-e2dd241860782029334eb99c88b2a3cb9cd076cfafa8435a7aa79ecdcd61f25c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Mathematical models</topic><topic>Partitions</topic><topic>Phenols</topic><topic>Regression</topic><topic>Regression analysis</topic><topic>Toxicity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Wei</creatorcontrib><creatorcontrib>Fan, Yanjun</creatorcontrib><creatorcontrib>Cai, Xunhui</creatorcontrib><creatorcontrib>Xiang, Yan</creatorcontrib><creatorcontrib>Jiang, Peng</creatorcontrib><creatorcontrib>Dai, Zhijun</creatorcontrib><creatorcontrib>Chen, Yuan</creatorcontrib><creatorcontrib>Tan, Siqiao</creatorcontrib><creatorcontrib>Yuan, Zheming</creatorcontrib><collection>CrossRef</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><jtitle>RSC advances</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Wei</au><au>Fan, Yanjun</au><au>Cai, Xunhui</au><au>Xiang, Yan</au><au>Jiang, Peng</au><au>Dai, Zhijun</au><au>Chen, Yuan</au><au>Tan, Siqiao</au><au>Yuan, Zheming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods</atitle><jtitle>RSC advances</jtitle><date>2016-01-01</date><risdate>2016</risdate><volume>6</volume><issue>18</issue><spage>16847</spage><epage>16855</epage><pages>16847-16855</pages><issn>2046-2069</issn><eissn>2046-2069</eissn><abstract>The environmental protection agency thinks that quantitative structure-activity 
relationship (QSAR) analysis can better replace toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions, and the optimal SVR models with all three training-test partitions had the highest external prediction ability, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software and used them to construct the MLR, SLR and SVR models. Our results showed that our best SVR model (
$R^{2}_{\mathrm{pred}} = 0.972$) significantly increased 16.55% on the test set, and the appropriate partition presented the better stability. The different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor of the model according to the interpretability analysis. Our work provided a valuable exploration of different combinations among data partition, descriptor selection and model and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.
The environmental protection agency thinks that quantitative structure-activity relationship (QSAR) analysis can better replace toxicity tests.</abstract><doi>10.1039/c6ra21076g</doi><tpages>9</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2046-2069 |
ispartof | RSC advances, 2016-01, Vol.6 (18), p.16847-16855 |
issn | 2046-2069 2046-2069 |
language | eng |
recordid | cdi_crossref_primary_10_1039_C6RA21076G |
source | Royal Society Of Chemistry Journals |
subjects | Mathematical models ; Partitions ; Phenols ; Regression ; Regression analysis ; Toxicity |
title | High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T21%3A05%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High-accuracy%20QSAR%20models%20of%20narcosis%20toxicities%20of%20phenols%20based%20on%20various%20data%20partition,%20descriptor%20selection%20and%20modelling%20methods&rft.jtitle=RSC%20advances&rft.au=Zhou,%20Wei&rft.date=2016-01-01&rft.volume=6&rft.issue=18&rft.spage=16847&rft.epage=16855&rft.pages=16847-16855&rft.issn=2046-2069&rft.eissn=2046-2069&rft_id=info:doi/10.1039/c6ra21076g&rft_dat=%3Cproquest_cross%3E1864544060%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1864544060&rft_id=info:pmid/&rfr_iscdi=true |
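
The abstract in this record reports external prediction ability as an R²pred value evaluated over three training-test partitions. The record does not give the formula or the data, so the following is only a minimal sketch under stated assumptions: it uses one common convention for external R² (1 minus the prediction error sum of squares over the test-set sum of squares about the training mean), a single-descriptor least-squares line as a stand-in for the MLR and SVR models, and made-up numbers rather than the phenol dataset. The names `r2_pred`, `fit_line`, `evaluate` and `partitions` are hypothetical.

```python
from statistics import mean


def r2_pred(y_train, y_test, y_hat):
    """External R^2 (assumed convention): 1 - PRESS / SS about the training mean."""
    y_bar = mean(y_train)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_hat))
    ss = sum((y - y_bar) ** 2 for y in y_test)
    return 1.0 - press / ss


def fit_line(x, y):
    """Ordinary least squares for a single descriptor; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic, made-up values: one descriptor x and one response y.
# These are NOT the 50 phenol analogues from the paper.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0.9, 1.6, 2.1, 2.4, 3.2, 3.5, 4.1, 4.4, 5.2, 5.4]


def evaluate(test_idx):
    """Fit on every sample outside test_idx, then score the held-out samples."""
    train_idx = [i for i in range(len(x)) if i not in test_idx]
    slope, intercept = fit_line([x[i] for i in train_idx],
                                [y[i] for i in train_idx])
    y_hat = [slope * x[i] + intercept for i in test_idx]
    return r2_pred([y[i] for i in train_idx],
                   [y[i] for i in test_idx],
                   y_hat)


# Three hypothetical training-test partitions of the same ten samples.
partitions = [[1, 4, 7], [0, 5, 9], [2, 3, 8]]
scores = [evaluate(p) for p in partitions]

for p, s in zip(partitions, scores):
    print(f"test indices {p}: R2_pred = {s:.3f}")
```

Scoring the same model on several partitions, as the abstract describes, gives a rough sense of how much the reported figure depends on which samples happen to be held out; a model whose score swings widely between partitions is less stable than one whose scores stay close together.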