Nucleic acid sequence design via efficient ensemble defect optimization

We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user‐specified stop condit...

Bibliographic details
Published in: Journal of computational chemistry 2011-02, Vol.32 (3), p.439-452
Main authors: Zadeh, Joseph N., Wolfe, Brian R., Pierce, Niles A.
Format: Article
Language: eng
Online access: Full text
container_end_page 452
container_issue 3
container_start_page 439
container_title Journal of computational chemistry
container_volume 32
creator Zadeh, Joseph N.
Wolfe, Brian R.
Pierce, Niles A.
description We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user‐specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree‐decomposition of the target structure. During leaf optimization, defect‐weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N³) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N³). For target structures containing N ∈ {100, 200, 400, 800, 1600, 3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011
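The quantities named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that equilibrium pair probabilities are already available as a hypothetical N x (N+1) table `pair_probs`, whose last column holds the probability that a nucleotide is unpaired, and that the target structure is given as a hypothetical list `target_partner`. All function names are illustrative. The last function assumes a balanced binary decomposition with one cubic-cost defect evaluation per tree node, which is one way a factor of 4/3 can arise; the abstract itself does not spell out these details.

```python
import random


def nucleotide_defects(pair_probs, target_partner):
    """Per-nucleotide defect: 1 minus the equilibrium probability that
    nucleotide i is in its target state (paired to its target partner,
    or unpaired if the target leaves it unpaired).

    pair_probs[i][j] for j < N : probability that i pairs with j
    pair_probs[i][N]           : probability that i is unpaired
    target_partner[i]          : partner index, or None if unpaired
    (Both inputs are assumed data layouts, not taken from the paper.)
    """
    n = len(target_partner)
    defects = []
    for i, partner in enumerate(target_partner):
        column = n if partner is None else partner
        defects.append(1.0 - pair_probs[i][column])
    return defects


def ensemble_defect(pair_probs, target_partner):
    """Average number of incorrectly paired nucleotides at equilibrium."""
    return sum(nucleotide_defects(pair_probs, target_partner))


def sample_mutation_position(defects, rng=random):
    """Defect-weighted sampling: choose position i with probability
    proportional to its contribution to the ensemble defect."""
    if sum(defects) <= 0.0:
        raise ValueError("no defect left to sample from")
    return rng.choices(range(len(defects)), weights=defects, k=1)[0]


def hierarchical_cost_ratio(depth):
    """Cost of one defect evaluation at every node of a balanced binary
    tree of the given depth, relative to one evaluation at the root,
    assuming cubic cost per evaluation. Level d has 2**d nodes of size
    N / 2**d, so it contributes 2**d * (1 / 2**d)**3 = 4**-d."""
    return sum(0.25 ** d for d in range(depth + 1))


# Toy target: nucleotides 0 and 3 are paired, 1 and 2 are unpaired.
target = [3, None, None, 0]
probs = [
    [0.0, 0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 0.2, 0.0, 0.8],
    [0.0, 0.2, 0.0, 0.0, 0.8],
    [0.9, 0.0, 0.0, 0.0, 0.1],
]
defects = nucleotide_defects(probs, target)
position = sample_mutation_position(defects, random.Random(0))
```

In this toy case the per-nucleotide defects are roughly 0.1, 0.2, 0.2 and 0.1, so the ensemble defect is about 0.6 and the two unpaired positions are each twice as likely to be selected for mutation as the paired ones. Under the stated assumptions, `hierarchical_cost_ratio` approaches the geometric-series limit 1 / (1 - 1/4) = 4/3 as the tree deepens.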
doi_str_mv 10.1002/jcc.21633
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_820789133</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>820789133</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4553-f6183183c9e458b6ad3046dc8f02faf46a416b8e740351395808f06884d392cc3</originalsourceid><addsrcrecordid>eNp1kEtLxDAUhYMoOo4u_ANS3IiLah5tmiy16KgMI4qiu5BJbyRjH2PTquOvNzrqQhAu3MX9zuHcg9AOwYcEY3o0M-aQEs7YChoQLHksRfawigaYSBoLnpINtOn9DGPMUp6sow2KM5JJnA7QaNKbEpyJtHFF5OG5h9pAVIB3j3X04nQE1jrjoO4iqD1U0_LzasF0UTPvXOXedeeaegutWV162P7eQ3R3dnqbn8fjq9FFfjyOTZKmLLacCBbGSEhSMeW6YDjhhREWU6ttwnVC-FRAloSohMlU4HDiQiQFk9QYNkT7S99524SsvlOV8wbKUtfQ9F6J8JqQhLFA7v0hZ03f1iFcgAiVFFMRoIMlZNrG-xasmreu0u1CEaw-u1WhW_XVbWB3vw37aQXFL_lTZgCOlsCrK2Hxv5O6zPMfy3ipcL6Dt1-Fbp8Uz1iWqvvJSMnr64kc35-oG_YBWLaQSw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>821292028</pqid></control><display><type>article</type><title>Nucleic acid sequence design via efficient ensemble defect optimization</title><source>MEDLINE</source><source>Wiley Online Library All Journals</source><creator>Zadeh, Joseph N. ; Wolfe, Brian R. ; Pierce, Niles A.</creator><creatorcontrib>Zadeh, Joseph N. ; Wolfe, Brian R. ; Pierce, Niles A.</creatorcontrib><description>We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user‐specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. 
To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree‐decomposition of the target structure. During leaf optimization, defect‐weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N³) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N³). For target structures containing N ∈{100,200,400,800,1600,3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp. © 2010 Wiley Periodicals, Inc. 
J Comput Chem, 2011</description><identifier>ISSN: 0192-8651</identifier><identifier>EISSN: 1096-987X</identifier><identifier>DOI: 10.1002/jcc.21633</identifier><identifier>PMID: 20717905</identifier><identifier>CODEN: JCCHDD</identifier><language>eng</language><publisher>Hoboken: Wiley Subscription Services, Inc., A Wiley Company</publisher><subject>Algorithms ; Base Sequence ; Defects ; Design optimization ; DNA ; DNA - chemistry ; Dynamic programming ; Molecular Sequence Data ; Molecular structure ; Mutation ; Nucleic Acid Conformation ; Optimization algorithms ; partition function ; Probability ; Ribonucleic acid ; RNA ; RNA - chemistry ; secondary structure ; sequence design</subject><ispartof>Journal of computational chemistry, 2011-02, Vol.32 (3), p.439-452</ispartof><rights>Copyright © 2010 Wiley Periodicals, Inc.</rights><rights>Copyright John Wiley and Sons, Limited Feb 2011</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4553-f6183183c9e458b6ad3046dc8f02faf46a416b8e740351395808f06884d392cc3</citedby><cites>FETCH-LOGICAL-c4553-f6183183c9e458b6ad3046dc8f02faf46a416b8e740351395808f06884d392cc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fjcc.21633$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fjcc.21633$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/20717905$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zadeh, Joseph N.</creatorcontrib><creatorcontrib>Wolfe, Brian R.</creatorcontrib><creatorcontrib>Pierce, Niles A.</creatorcontrib><title>Nucleic acid sequence design via efficient ensemble defect 
optimization</title><title>Journal of computational chemistry</title><addtitle>J. Comput. Chem</addtitle><description>We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user‐specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree‐decomposition of the target structure. During leaf optimization, defect‐weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N³) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N³). For target structures containing N ∈{100,200,400,800,1600,3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. 
Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011</description><subject>Algorithms</subject><subject>Base Sequence</subject><subject>Defects</subject><subject>Design optimization</subject><subject>DNA</subject><subject>DNA - chemistry</subject><subject>Dynamic programming</subject><subject>Molecular Sequence Data</subject><subject>Molecular structure</subject><subject>Mutation</subject><subject>Nucleic Acid Conformation</subject><subject>Optimization algorithms</subject><subject>partition function</subject><subject>Probability</subject><subject>Ribonucleic acid</subject><subject>RNA</subject><subject>RNA - chemistry</subject><subject>secondary structure</subject><subject>sequence design</subject><issn>0192-8651</issn><issn>1096-987X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kEtLxDAUhYMoOo4u_ANS3IiLah5tmiy16KgMI4qiu5BJbyRjH2PTquOvNzrqQhAu3MX9zuHcg9AOwYcEY3o0M-aQEs7YChoQLHksRfawigaYSBoLnpINtOn9DGPMUp6sow2KM5JJnA7QaNKbEpyJtHFF5OG5h9pAVIB3j3X04nQE1jrjoO4iqD1U0_LzasF0UTPvXOXedeeaegutWV162P7eQ3R3dnqbn8fjq9FFfjyOTZKmLLacCBbGSEhSMeW6YDjhhREWU6ttwnVC-FRAloSohMlU4HDiQiQFk9QYNkT7S99524SsvlOV8wbKUtfQ9F6J8JqQhLFA7v0hZ03f1iFcgAiVFFMRoIMlZNrG-xasmreu0u1CEaw-u1WhW_XVbWB3vw37aQXFL_lTZgCOlsCrK2Hxv5O6zPMfy3ipcL6Dt1-Fbp8Uz1iWqvvJSMnr64kc35-oG_YBWLaQSw</recordid><startdate>201102</startdate><enddate>201102</enddate><creator>Zadeh, Joseph N.</creator><creator>Wolfe, Brian R.</creator><creator>Pierce, Niles A.</creator><general>Wiley Subscription Services, Inc., A Wiley Company</general><general>Wiley Subscription Services, 
Inc</general><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>7X8</scope></search><sort><creationdate>201102</creationdate><title>Nucleic acid sequence design via efficient ensemble defect optimization</title><author>Zadeh, Joseph N. ; Wolfe, Brian R. ; Pierce, Niles A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4553-f6183183c9e458b6ad3046dc8f02faf46a416b8e740351395808f06884d392cc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Algorithms</topic><topic>Base Sequence</topic><topic>Defects</topic><topic>Design optimization</topic><topic>DNA</topic><topic>DNA - chemistry</topic><topic>Dynamic programming</topic><topic>Molecular Sequence Data</topic><topic>Molecular structure</topic><topic>Mutation</topic><topic>Nucleic Acid Conformation</topic><topic>Optimization algorithms</topic><topic>partition function</topic><topic>Probability</topic><topic>Ribonucleic acid</topic><topic>RNA</topic><topic>RNA - chemistry</topic><topic>secondary structure</topic><topic>sequence design</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zadeh, Joseph N.</creatorcontrib><creatorcontrib>Wolfe, Brian R.</creatorcontrib><creatorcontrib>Pierce, Niles A.</creatorcontrib><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of computational chemistry</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zadeh, Joseph N.</au><au>Wolfe, Brian R.</au><au>Pierce, Niles A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Nucleic acid sequence design via efficient ensemble defect optimization</atitle><jtitle>Journal of computational chemistry</jtitle><addtitle>J. Comput. Chem</addtitle><date>2011-02</date><risdate>2011</risdate><volume>32</volume><issue>3</issue><spage>439</spage><epage>452</epage><pages>439-452</pages><issn>0192-8651</issn><eissn>1096-987X</eissn><coden>JCCHDD</coden><abstract>We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user‐specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree‐decomposition of the target structure. During leaf optimization, defect‐weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. 
Using a Θ(N³) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N³). For target structures containing N ∈{100,200,400,800,1600,3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011</abstract><cop>Hoboken</cop><pub>Wiley Subscription Services, Inc., A Wiley Company</pub><pmid>20717905</pmid><doi>10.1002/jcc.21633</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0192-8651
ispartof Journal of computational chemistry, 2011-02, Vol.32 (3), p.439-452
issn 0192-8651
1096-987X
language eng
recordid cdi_proquest_miscellaneous_820789133
source MEDLINE; Wiley Online Library All Journals
subjects Algorithms
Base Sequence
Defects
Design optimization
DNA
DNA - chemistry
Dynamic programming
Molecular Sequence Data
Molecular structure
Mutation
Nucleic Acid Conformation
Optimization algorithms
partition function
Probability
Ribonucleic acid
RNA
RNA - chemistry
secondary structure
sequence design
title Nucleic acid sequence design via efficient ensemble defect optimization
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T05%3A17%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Nucleic%20acid%20sequence%20design%20via%20efficient%20ensemble%20defect%20optimization&rft.jtitle=Journal%20of%20computational%20chemistry&rft.au=Zadeh,%20Joseph%20N.&rft.date=2011-02&rft.volume=32&rft.issue=3&rft.spage=439&rft.epage=452&rft.pages=439-452&rft.issn=0192-8651&rft.eissn=1096-987X&rft.coden=JCCHDD&rft_id=info:doi/10.1002/jcc.21633&rft_dat=%3Cproquest_cross%3E820789133%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=821292028&rft_id=info:pmid/20717905&rfr_iscdi=true