Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)
The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch...
Published in: | Journal of computer-aided molecular design 2000-01, Vol.14 (1), p.1-21 |
---|---|
Main authors: | Turner, D B ; Willett, P |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 21 |
---|---|
container_issue | 1 |
container_start_page | 1 |
container_title | Journal of computer-aided molecular design |
container_volume | 14 |
creator | Turner, D B ; Willett, P |
description | The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed dimensional descriptor regardless of the number and precise values of the frequencies. It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores particularly in relation to test set predictions. |
doi_str_mv | 10.1023/A:1008180020974 |
format | Article |
fullrecord | Article record (source: MEDLINE; SpringerNature Journals). PMID: 10702922. DOI: 10.1023/A:1008180020974. ISSN: 0920-654X. EISSN: 1573-4951. Publisher: Netherlands: Springer Nature B.V. Rights: Kluwer Academic Publishers 2000. Peer reviewed; free to read. Title, authors, abstract, and subjects are as given in the other fields of this table. |
fulltext | fulltext |
identifier | ISSN: 0920-654X |
ispartof | Journal of computer-aided molecular design, 2000-01, Vol.14 (1), p.1-21 |
issn | 0920-654X ; 1573-4951 |
language | eng |
recordid | cdi_proquest_miscellaneous_70946756 |
source | MEDLINE; SpringerNature Journals |
subjects | Algorithms ; Databases, Factual ; Drug Design ; Genetic algorithms ; Ligands ; Models, Genetic ; Receptors, Cell Surface - metabolism ; Receptors, Cytoplasmic and Nuclear - metabolism ; Receptors, Melatonin ; Software ; Standard deviation ; Structure-Activity Relationship ; Studies ; Transcortin - metabolism |
title | Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA) |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T07%3A45%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Evaluation%20of%20the%20EVA%20descriptor%20for%20QSAR%20studies:%203.%20The%20use%20of%20a%20genetic%20algorithm%20to%20search%20for%20models%20with%20enhanced%20predictive%20properties%20(EVA_GA)&rft.jtitle=Journal%20of%20computer-aided%20molecular%20design&rft.au=Turner,%20D%20B&rft.date=2000-01&rft.volume=14&rft.issue=1&rft.spage=1&rft.epage=21&rft.pages=1-21&rft.issn=0920-654X&rft.eissn=1573-4951&rft_id=info:doi/10.1023/A:1008180020974&rft_dat=%3Cproquest_pubme%3E2101860921%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=737577841&rft_id=info:pmid/10702922&rfr_iscdi=true |
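
The description field above outlines how the EVA descriptor is built: each calculated vibrational frequency is smeared with a Gaussian on a linear scale, and the summed curve is sampled to give a fixed-length vector, whatever the number of frequencies. The following is a minimal Python sketch of that idea, not the authors' implementation. The spectral range (0 to 4000 cm^-1), the sampling step (5 cm^-1), the example sigma values, and the function names `eva_descriptor` and `localised_sigmas` are all assumptions made for illustration; none of them is stated in this record.

```python
import bisect
import math


def eva_descriptor(frequencies, sigmas, lo=0.0, hi=4000.0, step=5.0):
    """Sum one Gaussian per frequency and sample the sum on a fixed grid.

    frequencies: vibrational frequencies of one molecule (cm^-1)
    sigmas: one Gaussian standard deviation per frequency (cm^-1)
    lo, hi, step: sampling grid; illustrative values, not from the record
    """
    n_points = int((hi - lo) / step) + 1
    descriptor = []
    for i in range(n_points):
        x = lo + i * step
        total = 0.0
        for f, s in zip(frequencies, sigmas):
            norm = 1.0 / (s * math.sqrt(2.0 * math.pi))
            total += norm * math.exp(-((x - f) ** 2) / (2.0 * s ** 2))
        descriptor.append(total)
    return descriptor


def localised_sigmas(frequencies, region_edges, region_sigmas):
    """Give each frequency the sigma of the spectral region it falls in.

    region_edges: ascending boundaries between regions (cm^-1)
    region_sigmas: one sigma per region, so len(region_edges) + 1 values
    """
    return [region_sigmas[bisect.bisect_right(region_edges, f)]
            for f in frequencies]


# Two hypothetical molecules with different numbers of frequencies.
mol_a = [1000.0, 1020.0, 3000.0]
mol_b = [500.0]

# Uniform smearing: the same sigma everywhere.
uniform_a = eva_descriptor(mol_a, [10.0] * len(mol_a))
uniform_b = eva_descriptor(mol_b, [10.0] * len(mol_b))

# Localised smearing: a different sigma in each of three assumed regions.
local_a = eva_descriptor(
    mol_a, localised_sigmas(mol_a, [1500.0, 2500.0], [5.0, 10.0, 20.0]))

print(len(uniform_a), len(uniform_b), len(local_a))
```

In the approach the abstract describes, a genetic algorithm would search over the per-region sigma values (the `region_sigmas` list in this sketch), scoring each candidate by cross-validated PLS regression. That search and the PLS step are not implemented here.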