Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large Set of Drugs
The performance of three novel QSAR algorithms, which combine principal component-artificial neural network (PC-ANN) modeling with three factor selection procedures, namely eigenvalue ranking, correlation ranking, and a genetic algorithm (ER-PC-ANN, CR-PC-ANN, and PC-GA-ANN, respectively), is compared by ap...
Published in: | Journal of chemical information and modeling 2005-01, Vol.45 (1), p.190-199 |
---|---|
Main authors: | Hemmateenejad, Bahram; Safarpour, Mohammad A; Miri, Ramin; Nesari, Nasim |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 199 |
---|---|
container_issue | 1 |
container_start_page | 190 |
container_title | Journal of chemical information and modeling |
container_volume | 45 |
creator | Hemmateenejad, Bahram Safarpour, Mohammad A Miri, Ramin Nesari, Nasim |
description | The performance of three novel QSAR algorithms is compared. Each combines principal component-artificial neural network (PC-ANN) modeling with one of three factor selection procedures: eigenvalue ranking, correlation ranking, or a genetic algorithm (ER-PC-ANN, CR-PC-ANN, and PC-GA-ANN, respectively). The models are applied to predicting the carcinogenic activity of a large, structurally diverse set of 735 drugs. A total of 1350 theoretical descriptors are calculated for each molecule. The resulting 735 × 1350 descriptor matrix is subjected to PCA; the first 137 principal components (PCs) explain 95% of the variance in the matrix. From this pool of 137 PCs, the factor selection methods (ER, CR, and GA) are used to select the best set of PCs for PC-ANN modeling. In ER-PC-ANN, the PCs are entered into the ANN successively in order of decreasing eigenvalue. In CR-PC-ANN, the ANN is first used to model the nonlinear relationship between each PC and the carcinogenic activity separately; the PCs are then ranked by decreasing correlating ability and entered into the input layer of the network one after another. In PC-GA-ANN, a search algorithm (a genetic algorithm) is used to find the best set of PCs. Both external validation and cross-validation are used to assess the resulting models. The results obtained by the PC-GA-ANN and CR-PC-ANN procedures are superior to those obtained from ER-PC-ANN. The PC-GA-ANN results are better than those of CR-PC-ANN, but the difference is not significant. |
doi_str_mv | 10.1021/ci049766z |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_216243317</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>854039561</sourcerecordid><originalsourceid>FETCH-LOGICAL-a378t-fe3c28b07b33e738673ba8a7978b655a22c521e2b51436f4904f68259904997c3</originalsourceid><addsrcrecordid>eNplkEtPGzEUhS1UxHvRP4CsSixYDPVjbI_ZhVAeUgqpCKg7y-PxBEMYB3umbVix5W_yS3CUAIuu7r06n865OgB8xegAI4K_G4dyKTh_WgEbmOUykxz9_vK-M8nXwWaMdwhRKjlZA-uYcS5wzjbAbOT_6lBB3cDLaese9AQOgze26oKFtQ9w2M96Fxfwp6_sBB51blK5Znz4-vySOFs50zrfQF_D9tbCvg7GNX5sG2dgL0l_XDubixoOdBhbeGXb-XkcunHcBqu1nkS7s5xb4Prkx6h_lg0uT8_7vUGmqSjarLbUkKJEoqTUClpwQUtdaCFFUXLGNCGGEWxJyXBOeZ1LlNe8IEymRUph6Bb4tvCdBv_Y2diqO9-FJkUqgjnJKcUiQfsLyAQfY7C1moZURpgpjNS8Y_XRcWJ3l4Zd-WCrT3JZagKyBeBia_996Drcq_S9YGo0vFJH5Iagk5tfapT4vQWvTfx87v_gN_tYkMI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>216243317</pqid></control><display><type>article</type><title>Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large Set of Drugs</title><source>MEDLINE</source><source>ACS Publications</source><creator>Hemmateenejad, Bahram ; Safarpour, Mohammad A ; Miri, Ramin ; Nesari, Nasim</creator><creatorcontrib>Hemmateenejad, Bahram ; Safarpour, Mohammad A ; Miri, Ramin ; Nesari, Nasim</creatorcontrib><description>The performances of the three novel QSAR algorithms, principal component-artificial neural network modeling method combining with three factor selection procedures named eigenvalue ranking, correlation ranking, and genetic algorithm (ER-PC-ANN, CR-PC-ANN, PC-GA-ANN, respectively), are compared by application of these model to the prediction of the carcinogenic activity of a large set of drugs (735 drugs) belonging to a diverse type of compounds. A total number of 1350 theoretical descriptors are calculated for each molecule. 
The matrix of calculated descriptors (with 735 × 1350 dimension) is subjected to PCA. 95% of the variances in the matrix are explained by the first 137 principal components (PC's). From the pool of 137 PC's, the factor selection methods (ER, CR, and GA) are employed to select the best set of PC's for PC-ANN modeling. In the ER-PC-ANN, the PC's are successively entered into the ANN based on their decreasing eigenvalue. In the CR-PC-ANN, the ANN is first employed to model the nonlinear relationship between each one of the PC's and the carcinogen activity separately. Then, the PC's are ranked based on their decreasing correlating ability and entered to the input layer of the network one after another. Finally, a search algorithm (i.e. genetic algorithm) is used to find the best set of PC's. Both the external and cross-validation methods are used to validate the performances of the resulting models. One is able to see that the results obtained by the PC-GA-ANN and CR-PC-ANN procedures are superior to those resulted from the EV-PC-ANN. Comparison of the results reveals that the results produced by the PC-GA-ANN algorithm are better than those produced by CR-PC-ANN. 
However, the difference is not significant.</description><identifier>ISSN: 1549-9596</identifier><identifier>EISSN: 1549-960X</identifier><identifier>DOI: 10.1021/ci049766z</identifier><identifier>PMID: 15667145</identifier><language>eng</language><publisher>United States: American Chemical Society</publisher><subject>Algorithms ; Bioinformatics ; Carcinogens ; Carcinogens - chemistry ; Carcinogens - toxicity ; Comparative analysis ; Computer based modeling ; Models, Chemical ; Neural Networks (Computer) ; Pharmaceuticals ; Principal Component Analysis ; Quantitative Structure-Activity Relationship</subject><ispartof>Journal of chemical information and modeling, 2005-01, Vol.45 (1), p.190-199</ispartof><rights>Copyright © 2005 American Chemical Society</rights><rights>Copyright American Chemical Society Jan/Feb 2005</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a378t-fe3c28b07b33e738673ba8a7978b655a22c521e2b51436f4904f68259904997c3</citedby><cites>FETCH-LOGICAL-a378t-fe3c28b07b33e738673ba8a7978b655a22c521e2b51436f4904f68259904997c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/ci049766z$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/ci049766z$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>314,780,784,2764,27075,27923,27924,56737,56787</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/15667145$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hemmateenejad, Bahram</creatorcontrib><creatorcontrib>Safarpour, Mohammad A</creatorcontrib><creatorcontrib>Miri, Ramin</creatorcontrib><creatorcontrib>Nesari, Nasim</creatorcontrib><title>Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large 
Set of Drugs</title><title>Journal of chemical information and modeling</title><addtitle>J. Chem. Inf. Model</addtitle><description>The performances of the three novel QSAR algorithms, principal component-artificial neural network modeling method combining with three factor selection procedures named eigenvalue ranking, correlation ranking, and genetic algorithm (ER-PC-ANN, CR-PC-ANN, PC-GA-ANN, respectively), are compared by application of these model to the prediction of the carcinogenic activity of a large set of drugs (735 drugs) belonging to a diverse type of compounds. A total number of 1350 theoretical descriptors are calculated for each molecule. The matrix of calculated descriptors (with 735 × 1350 dimension) is subjected to PCA. 95% of the variances in the matrix are explained by the first 137 principal components (PC's). From the pool of 137 PC's, the factor selection methods (ER, CR, and GA) are employed to select the best set of PC's for PC-ANN modeling. In the ER-PC-ANN, the PC's are successively entered into the ANN based on their decreasing eigenvalue. In the CR-PC-ANN, the ANN is first employed to model the nonlinear relationship between each one of the PC's and the carcinogen activity separately. Then, the PC's are ranked based on their decreasing correlating ability and entered to the input layer of the network one after another. Finally, a search algorithm (i.e. genetic algorithm) is used to find the best set of PC's. Both the external and cross-validation methods are used to validate the performances of the resulting models. One is able to see that the results obtained by the PC-GA-ANN and CR-PC-ANN procedures are superior to those resulted from the EV-PC-ANN. Comparison of the results reveals that the results produced by the PC-GA-ANN algorithm are better than those produced by CR-PC-ANN. 
However, the difference is not significant.</description><subject>Algorithms</subject><subject>Bioinformatics</subject><subject>Carcinogens</subject><subject>Carcinogens - chemistry</subject><subject>Carcinogens - toxicity</subject><subject>Comparative analysis</subject><subject>Computer based modeling</subject><subject>Models, Chemical</subject><subject>Neural Networks (Computer)</subject><subject>Pharmaceuticals</subject><subject>Principal Component Analysis</subject><subject>Quantitative Structure-Activity Relationship</subject><issn>1549-9596</issn><issn>1549-960X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNplkEtPGzEUhS1UxHvRP4CsSixYDPVjbI_ZhVAeUgqpCKg7y-PxBEMYB3umbVix5W_yS3CUAIuu7r06n865OgB8xegAI4K_G4dyKTh_WgEbmOUykxz9_vK-M8nXwWaMdwhRKjlZA-uYcS5wzjbAbOT_6lBB3cDLaese9AQOgze26oKFtQ9w2M96Fxfwp6_sBB51blK5Znz4-vySOFs50zrfQF_D9tbCvg7GNX5sG2dgL0l_XDubixoOdBhbeGXb-XkcunHcBqu1nkS7s5xb4Prkx6h_lg0uT8_7vUGmqSjarLbUkKJEoqTUClpwQUtdaCFFUXLGNCGGEWxJyXBOeZ1LlNe8IEymRUph6Bb4tvCdBv_Y2diqO9-FJkUqgjnJKcUiQfsLyAQfY7C1moZURpgpjNS8Y_XRcWJ3l4Zd-WCrT3JZagKyBeBia_996Drcq_S9YGo0vFJH5Iagk5tfapT4vQWvTfx87v_gN_tYkMI</recordid><startdate>20050101</startdate><enddate>20050101</enddate><creator>Hemmateenejad, Bahram</creator><creator>Safarpour, Mohammad A</creator><creator>Miri, Ramin</creator><creator>Nesari, Nasim</creator><general>American Chemical Society</general><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20050101</creationdate><title>Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large Set 
of Drugs</title><author>Hemmateenejad, Bahram ; Safarpour, Mohammad A ; Miri, Ramin ; Nesari, Nasim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a378t-fe3c28b07b33e738673ba8a7978b655a22c521e2b51436f4904f68259904997c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Algorithms</topic><topic>Bioinformatics</topic><topic>Carcinogens</topic><topic>Carcinogens - chemistry</topic><topic>Carcinogens - toxicity</topic><topic>Comparative analysis</topic><topic>Computer based modeling</topic><topic>Models, Chemical</topic><topic>Neural Networks (Computer)</topic><topic>Pharmaceuticals</topic><topic>Principal Component Analysis</topic><topic>Quantitative Structure-Activity Relationship</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hemmateenejad, Bahram</creatorcontrib><creatorcontrib>Safarpour, Mohammad A</creatorcontrib><creatorcontrib>Miri, Ramin</creatorcontrib><creatorcontrib>Nesari, Nasim</creatorcontrib><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of chemical information and 
modeling</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hemmateenejad, Bahram</au><au>Safarpour, Mohammad A</au><au>Miri, Ramin</au><au>Nesari, Nasim</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large Set of Drugs</atitle><jtitle>Journal of chemical information and modeling</jtitle><addtitle>J. Chem. Inf. Model</addtitle><date>2005-01-01</date><risdate>2005</risdate><volume>45</volume><issue>1</issue><spage>190</spage><epage>199</epage><pages>190-199</pages><issn>1549-9596</issn><eissn>1549-960X</eissn><abstract>The performances of the three novel QSAR algorithms, principal component-artificial neural network modeling method combining with three factor selection procedures named eigenvalue ranking, correlation ranking, and genetic algorithm (ER-PC-ANN, CR-PC-ANN, PC-GA-ANN, respectively), are compared by application of these model to the prediction of the carcinogenic activity of a large set of drugs (735 drugs) belonging to a diverse type of compounds. A total number of 1350 theoretical descriptors are calculated for each molecule. The matrix of calculated descriptors (with 735 × 1350 dimension) is subjected to PCA. 95% of the variances in the matrix are explained by the first 137 principal components (PC's). From the pool of 137 PC's, the factor selection methods (ER, CR, and GA) are employed to select the best set of PC's for PC-ANN modeling. In the ER-PC-ANN, the PC's are successively entered into the ANN based on their decreasing eigenvalue. In the CR-PC-ANN, the ANN is first employed to model the nonlinear relationship between each one of the PC's and the carcinogen activity separately. Then, the PC's are ranked based on their decreasing correlating ability and entered to the input layer of the network one after another. 
Finally, a search algorithm (i.e. genetic algorithm) is used to find the best set of PC's. Both the external and cross-validation methods are used to validate the performances of the resulting models. One is able to see that the results obtained by the PC-GA-ANN and CR-PC-ANN procedures are superior to those resulted from the EV-PC-ANN. Comparison of the results reveals that the results produced by the PC-GA-ANN algorithm are better than those produced by CR-PC-ANN. However, the difference is not significant.</abstract><cop>United States</cop><pub>American Chemical Society</pub><pmid>15667145</pmid><doi>10.1021/ci049766z</doi><tpages>10</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1549-9596 |
ispartof | Journal of chemical information and modeling, 2005-01, Vol.45 (1), p.190-199 |
issn | 1549-9596 1549-960X |
language | eng |
recordid | cdi_proquest_journals_216243317 |
source | MEDLINE; ACS Publications |
subjects | Algorithms Bioinformatics Carcinogens Carcinogens - chemistry Carcinogens - toxicity Comparative analysis Computer based modeling Models, Chemical Neural Networks (Computer) Pharmaceuticals Principal Component Analysis Quantitative Structure-Activity Relationship |
title | Toward an Optimal Procedure for PC-ANN Model Building: Prediction of the Carcinogenic Activity of a Large Set of Drugs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T21%3A35%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Toward%20an%20Optimal%20Procedure%20for%20PC-ANN%20Model%20Building:%E2%80%89%20Prediction%20of%20the%20Carcinogenic%20Activity%20of%20a%20Large%20Set%20of%20Drugs&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Hemmateenejad,%20Bahram&rft.date=2005-01-01&rft.volume=45&rft.issue=1&rft.spage=190&rft.epage=199&rft.pages=190-199&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/ci049766z&rft_dat=%3Cproquest_cross%3E854039561%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=216243317&rft_id=info:pmid/15667145&rfr_iscdi=true |
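
The factor selection steps named in the abstract (retaining the PCs that explain 95% of the variance, then ordering them by eigenvalue or by correlation with the activity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy data are hypothetical, and a plain Pearson correlation stands in for the per-PC ANN models that the paper uses to measure "correlating ability". The genetic algorithm search is not shown.

```python
import math


def n_components_for_variance(eigenvalues, threshold=0.95):
    """Smallest number of leading PCs whose eigenvalues reach `threshold` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


def eigenvalue_ranking(eigenvalues):
    """ER: PC indices in order of decreasing eigenvalue."""
    return sorted(range(len(eigenvalues)), key=lambda i: eigenvalues[i], reverse=True)


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_ranking(scores, activity):
    """CR (linear stand-in): PC indices in order of decreasing |r| with the activity.

    Assumption: the paper ranks PCs with one ANN per PC; Pearson's r is used
    here only to keep the sketch short.
    """
    n_pcs = len(scores[0])
    strength = [
        abs(pearson([row[j] for row in scores], activity)) for j in range(n_pcs)
    ]
    return sorted(range(n_pcs), key=lambda j: strength[j], reverse=True)


# Hypothetical toy data: 4 molecules x 4 PC scores, plus one activity value each.
eigenvalues = [5.0, 3.0, 1.6, 0.4]
scores = [
    [2.0, -1.0, 0.1, 0.3],
    [-1.0, 2.0, 0.4, 0.1],
    [1.0, 1.0, 0.2, 0.4],
    [-2.0, -2.0, 0.3, 0.2],
]
activity = [0.1, 0.4, 0.2, 0.3]

kept = n_components_for_variance(eigenvalues)
er_order = eigenvalue_ranking(eigenvalues)
cr_order = correlation_ranking(scores, activity)
print(kept, er_order, cr_order)
```

In this toy case the third PC has a small eigenvalue but tracks the activity most closely, so it moves to the front under correlation ranking while eigenvalue ranking leaves it third. That difference in ordering is the kind of effect the comparison between ER-PC-ANN and CR-PC-ANN is about.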