Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)

The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch...

Full description

Bibliographic details
Published in: Journal of computer-aided molecular design 2000-01, Vol.14 (1), p.1-21
Main authors: Turner, D B; Willett, P
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 21
container_issue 1
container_start_page 1
container_title Journal of computer-aided molecular design
container_volume 14
creator Turner, D B
Willett, P
description The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed dimensional descriptor regardless of the number and precise values of the frequencies. It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores particularly in relation to test set predictions.
doi_str_mv 10.1023/A:1008180020974
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_70946756</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2101860921</sourcerecordid><originalsourceid>FETCH-LOGICAL-c321t-c6dd589bf845b471397adae79de165e79513bce0e1d1736e7645355c35a9659b3</originalsourceid><addsrcrecordid>eNpdkEFv1DAQha0K1G5Lz70hiwOCQ8o4ju24t6jaFqRKCNqi3iLHnu26SuLFdor4I_xeTCkXDqM30nxv5mkIOWFwyqDmH7ozBtCyFqAGrZo9smJC8arRgr0gK9A1VFI0dwfkMKUHAFBawj45YKCg1nW9Ir_Wj2ZcTPZhpmFD8xbp-ltHHSYb_S6HSDelvlx3X2nKi_OYzig_pTeFWxL-sRh6jzNmb6kZ70P0eTvRHGhCE-32yT0Fh2OiP8qI4rw1s0VHdxGdt9k_YmnDDmMuu-m7cry_7N6_Ii83Zkx4_KxH5PZifXP-sbr6fPnpvLuqLK9Zrqx0TrR62LSNGBrFuFbGGVTaIZOiqGB8sAjIHFNcopKN4EJYLoyWQg_8iLz9u7dk-L5gyv3kk8VxNDOGJfUKdCOVkAV88x_4EJY4l2y94koo1TasQK-foWWY0PW76CcTf_b_3s1_A-TpgN4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>737577841</pqid></control><display><type>article</type><title>Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)</title><source>MEDLINE</source><source>SpringerNature Journals</source><creator>Turner, D B ; Willett, P</creator><creatorcontrib>Turner, D B ; Willett, P</creatorcontrib><description>The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. 
The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed dimensional descriptor regardless of the number and precise values of the frequencies. It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores particularly in relation to test set predictions.</description><identifier>ISSN: 0920-654X</identifier><identifier>EISSN: 1573-4951</identifier><identifier>DOI: 10.1023/A:1008180020974</identifier><identifier>PMID: 10702922</identifier><language>eng</language><publisher>Netherlands: Springer Nature B.V</publisher><subject>Algorithms ; Databases, Factual ; Drug Design ; Genetic algorithms ; Ligands ; Models, Genetic ; Receptors, Cell Surface - metabolism ; Receptors, Cytoplasmic and Nuclear - metabolism ; Receptors, Melatonin ; Software ; Standard deviation ; Structure-Activity Relationship ; Studies ; Transcortin - metabolism</subject><ispartof>Journal of computer-aided molecular design, 2000-01, Vol.14 (1), p.1-21</ispartof><rights>Kluwer Academic Publishers 
2000</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c321t-c6dd589bf845b471397adae79de165e79513bce0e1d1736e7645355c35a9659b3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,27926,27927</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/10702922$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Turner, D B</creatorcontrib><creatorcontrib>Willett, P</creatorcontrib><title>Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)</title><title>Journal of computer-aided molecular design</title><addtitle>J Comput Aided Mol Des</addtitle><description>The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed dimensional descriptor regardless of the number and precise values of the frequencies. 
It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores particularly in relation to test set predictions.</description><subject>Algorithms</subject><subject>Databases, Factual</subject><subject>Drug Design</subject><subject>Genetic algorithms</subject><subject>Ligands</subject><subject>Models, Genetic</subject><subject>Receptors, Cell Surface - metabolism</subject><subject>Receptors, Cytoplasmic and Nuclear - metabolism</subject><subject>Receptors, Melatonin</subject><subject>Software</subject><subject>Standard deviation</subject><subject>Structure-Activity Relationship</subject><subject>Studies</subject><subject>Transcortin - 
metabolism</subject><issn>0920-654X</issn><issn>1573-4951</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2000</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNpdkEFv1DAQha0K1G5Lz70hiwOCQ8o4ju24t6jaFqRKCNqi3iLHnu26SuLFdor4I_xeTCkXDqM30nxv5mkIOWFwyqDmH7ozBtCyFqAGrZo9smJC8arRgr0gK9A1VFI0dwfkMKUHAFBawj45YKCg1nW9Ir_Wj2ZcTPZhpmFD8xbp-ltHHSYb_S6HSDelvlx3X2nKi_OYzig_pTeFWxL-sRh6jzNmb6kZ70P0eTvRHGhCE-32yT0Fh2OiP8qI4rw1s0VHdxGdt9k_YmnDDmMuu-m7cry_7N6_Ii83Zkx4_KxH5PZifXP-sbr6fPnpvLuqLK9Zrqx0TrR62LSNGBrFuFbGGVTaIZOiqGB8sAjIHFNcopKN4EJYLoyWQg_8iLz9u7dk-L5gyv3kk8VxNDOGJfUKdCOVkAV88x_4EJY4l2y94koo1TasQK-foWWY0PW76CcTf_b_3s1_A-TpgN4</recordid><startdate>200001</startdate><enddate>200001</enddate><creator>Turner, D B</creator><creator>Willett, P</creator><general>Springer Nature 
B.V</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>3V.</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>88I</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>BKSAR</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>KB.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>P5Z</scope><scope>P62</scope><scope>PCBAR</scope><scope>PDBOC</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7X8</scope></search><sort><creationdate>200001</creationdate><title>Evaluation of the EVA descriptor for QSAR studies: 3. 
The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)</title><author>Turner, D B ; Willett, P</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c321t-c6dd589bf845b471397adae79de165e79513bce0e1d1736e7645355c35a9659b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2000</creationdate><topic>Algorithms</topic><topic>Databases, Factual</topic><topic>Drug Design</topic><topic>Genetic algorithms</topic><topic>Ligands</topic><topic>Models, Genetic</topic><topic>Receptors, Cell Surface - metabolism</topic><topic>Receptors, Cytoplasmic and Nuclear - metabolism</topic><topic>Receptors, Melatonin</topic><topic>Software</topic><topic>Standard deviation</topic><topic>Structure-Activity Relationship</topic><topic>Studies</topic><topic>Transcortin - metabolism</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Turner, D B</creatorcontrib><creatorcontrib>Willett, P</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni 
Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Earth, Atmospheric &amp; Aquatic Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Materials Science Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Earth, Atmospheric &amp; Aquatic Science Database</collection><collection>Materials Science Collection</collection><collection>ProQuest One Academic Eastern Edition 
(DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of computer-aided molecular design</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Turner, D B</au><au>Willett, P</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)</atitle><jtitle>Journal of computer-aided molecular design</jtitle><addtitle>J Comput Aided Mol Des</addtitle><date>2000-01</date><risdate>2000</risdate><volume>14</volume><issue>1</issue><spage>1</spage><epage>21</epage><pages>1-21</pages><issn>0920-654X</issn><eissn>1573-4951</eissn><abstract>The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed dimensional descriptor regardless of the number and precise values of the frequencies. 
It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores particularly in relation to test set predictions.</abstract><cop>Netherlands</cop><pub>Springer Nature B.V</pub><pmid>10702922</pmid><doi>10.1023/A:1008180020974</doi><tpages>21</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0920-654X
ispartof Journal of computer-aided molecular design, 2000-01, Vol.14 (1), p.1-21
issn 0920-654X
1573-4951
language eng
recordid cdi_proquest_miscellaneous_70946756
source MEDLINE; SpringerNature Journals
subjects Algorithms
Databases, Factual
Drug Design
Genetic algorithms
Ligands
Models, Genetic
Receptors, Cell Surface - metabolism
Receptors, Cytoplasmic and Nuclear - metabolism
Receptors, Melatonin
Software
Standard deviation
Structure-Activity Relationship
Studies
Transcortin - metabolism
title Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T07%3A45%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Evaluation%20of%20the%20EVA%20descriptor%20for%20QSAR%20studies:%203.%20The%20use%20of%20a%20genetic%20algorithm%20to%20search%20for%20models%20with%20enhanced%20predictive%20properties%20(EVA_GA)&rft.jtitle=Journal%20of%20computer-aided%20molecular%20design&rft.au=Turner,%20D%20B&rft.date=2000-01&rft.volume=14&rft.issue=1&rft.spage=1&rft.epage=21&rft.pages=1-21&rft.issn=0920-654X&rft.eissn=1573-4951&rft_id=info:doi/10.1023/A:1008180020974&rft_dat=%3Cproquest_pubme%3E2101860921%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=737577841&rft_id=info:pmid/10702922&rfr_iscdi=true
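
The Gaussian smearing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frequency range (0 to 4000), the sampling step of 5, the function names, and the rule that a frequency takes the sigma of the spectral region it falls in are all assumptions made here for illustration. The sketch only shows how summing one Gaussian per frequency and sampling the result on a fixed grid yields a descriptor of fixed length regardless of how many frequencies a molecule has.

```python
import math


def eva_descriptor(frequencies, sigma_for, lo=0.0, hi=4000.0, step=5.0):
    """Sum one unit-area Gaussian per frequency and sample the total on a fixed grid.

    lo, hi and step are illustrative assumptions, not values taken from the record.
    sigma_for maps a frequency to the standard deviation used for its Gaussian.
    """
    n_points = int(round((hi - lo) / step)) + 1
    grid = [lo + k * step for k in range(n_points)]
    descriptor = []
    for x in grid:
        total = 0.0
        for f in frequencies:
            s = sigma_for(f)
            norm = s * math.sqrt(2.0 * math.pi)
            total += math.exp(-((x - f) ** 2) / (2.0 * s * s)) / norm
        descriptor.append(total)
    return descriptor


def uniform_sigma(value):
    """Original scheme: the same sigma everywhere on the scale."""
    return lambda f: value


def regional_sigma(boundaries, sigmas):
    """Hypothetical localised scheme: one sigma per spectral region.

    boundaries holds the ascending upper edge of each region;
    sigmas holds one value per region.
    """
    def lookup(f):
        for edge, s in zip(boundaries, sigmas):
            if f < edge:
                return s
        return sigmas[-1]
    return lookup


# Two made-up frequency lists of different length give descriptors of equal length.
a = eva_descriptor([1000.0, 1010.0, 3000.0], uniform_sigma(10.0))
b = eva_descriptor([500.0, 1500.0], regional_sigma([1500.0, 4000.0], [5.0, 20.0]))
print(len(a), len(b))
```

In the approach the abstract describes, a genetic algorithm would propose candidate sets of localised sigma values (here, the `sigmas` list) and score each candidate by the crossvalidated PLS result of the model built from the resulting descriptors. That search loop is not shown in this sketch.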