ExpertRNA: A New Framework for RNA Secondary Structure Prediction

Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structure it adopts. As a result, great effort is being dedicated to the design of effi...

Bibliographic details
Published in: INFORMS journal on computing 2022-09, Vol.34 (5), p.2464-2484
Main author: Liu, Menghan
Format: Article
Language: eng
Online access: Full text
container_end_page 2484
container_issue 5
container_start_page 2464
container_title INFORMS journal on computing
container_volume 34
creator Liu, Menghan
description Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structure they adopt. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”: given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms largely rely on finding the structure with minimum free energy. However, these predictions rely on simplified effective free energy models that may not identify the correct structure as the one with the lowest free energy. In light of this, new data-driven approaches that not only consider free energy but also use machine learning techniques to learn motifs have been investigated and have recently been shown to outperform free energy–based algorithms on several experimental data sets. In this work, we introduce the new ExpertRNA algorithm, which provides a modular framework that can easily incorporate an arbitrary number of rewards (free energy or nonparametric/data driven) and secondary structure prediction algorithms. We argue that this capability of ExpertRNA has the potential to balance out different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets, and we compare the performance of ExpertRNA against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions of nonpseudoknotted secondary structures than the structure prediction algorithm used, thus validating the promise of the approach. Summary of Contribution: ExpertRNA is a new algorithm inspired by a biological problem. It is applied to solve the problem of secondary structure prediction for RNA molecules given an input sequence. 
The computational contribution is given by the design of a multibranch, multiexpert rollout algorithm that enables the use of several state-of-the-art approaches as base heuristics and allows several experts to evaluate the partial candidate solutions generated, thus avoiding any assumption about the reward being optimized by an RNA molecule when folding. Our implementation allows for the effective use of parallel computational resources as well as control over the size of the rollout tree as the algorithm progresses. The problem of RNA secondary structure prediction is of primary impor
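
The description above frames the folding problem as: given a sequence of nucleotides, return a list of base pairs. As a minimal illustration of that output format only, the sketch below converts a nonpseudoknotted structure written in dot-bracket notation into a list of base pairs. Dot-bracket notation, the function name base_pairs, and the toy sequence are assumptions made for illustration; none of them comes from this record, and this is not the ExpertRNA algorithm.

```python
from typing import List, Tuple


def base_pairs(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return the sorted 0-based (i, j) base pairs of a dot-bracket string.

    Assumes a nonpseudoknotted structure: '(' opens a pair, ')' closes the
    most recently opened one, and '.' marks an unpaired nucleotide.
    """
    open_positions: List[int] = []  # stack of indices of unmatched '('
    pairs: List[Tuple[int, int]] = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((open_positions.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Hypothetical toy input: a 10-nucleotide hairpin, not data from the paper.
sequence = "GGGAAAUCCC"
structure = "(((....)))"
for i, j in base_pairs(structure):
    print(f"{sequence[i]}{i} pairs with {sequence[j]}{j}")
```

For the toy hairpin, base_pairs(structure) returns [(0, 9), (1, 8), (2, 7)], that is, three G-C pairs closing a four-nucleotide loop.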
doi_str_mv 10.1287/ijoc.2022.1188
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2736343021</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2736343021</sourcerecordid><originalsourceid>FETCH-LOGICAL-c362t-b6d19f701c06eec12c906931b361f1cc4c582192d307230feb4700edff7293a93</originalsourceid><addsrcrecordid>eNqFkM9LwzAUgIMoOKdXzwHPre8lbdp4G2NTYUxxeg5dmkCna2bSMv3vTang0dN78L736yPkGiFFVha3zc7plAFjKWJZnpAJ5kwkec7K05iDxESWuTgnFyHsACDjmZyQ2eLrYHz3sp7d0RldmyNd-mpvjs6_U-s8jQW6Mdq1deW_6abzve56b-izN3Wju8a1l-TMVh_BXP3GKXlbLl7nD8nq6f5xPlslmgvWJVtRo7QFoAZhjEamJQjJccsFWtQ603nJULKaQ8E4WLPNCgBTW1swySvJp-RmnHvw7rM3oVM71_s2rlSs4IJnHBhGKh0p7V0I3lh18M0-3q4Q1KBJDZrUoEkNmmIDHRuGJ5vwh0dXkS8xi0gyIk0blezDfyN_AG47clk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2736343021</pqid></control><display><type>article</type><title>ExpertRNA: A New Framework for RNA Secondary Structure Prediction</title><source>Informs</source><creator>Liu, Menghan</creator><creatorcontrib>Liu, Menghan</creatorcontrib><description>Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structure it adopts. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”—given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms largely rely on finding the structure with minimum free energy. However, the predictions rely on effective simplified free energy models that may not correctly identify the correct structure as the one with the lowest free energy. 
In light of this, new, data-driven approaches that not only consider free energy, but also use machine learning techniques to learn motifs are also investigated and recently been shown to outperform free energy–based algorithms on several experimental data sets. In this work, we introduce the new ExpertRNA algorithm that provides a modular framework that can easily incorporate an arbitrary number of rewards (free energy or nonparametric/data driven) and secondary structure prediction algorithms. We argue that this capability of ExpertRNA has the potential to balance out different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets, and we compare the performance of ExpertRNA against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions of nonpseudoknotted secondary structures than the structure prediction algorithm used, thus validating the promise of the approach. Summary of Contribution: ExpertRNA is a new algorithm inspired by a biological problem. It is applied to solve the problem of secondary structure prediction for RNA molecules given an input sequence. The computational contribution is given by the design of a multibranch, multiexpert rollout algorithm that enables the use of several state-of-the-art approaches as base heuristics and allowing several experts to evaluate partial candidate solutions generated, thus avoiding assuming the reward being optimized by an RNA molecule when folding. Our implementation allows for the effective use of parallel computational resources as well as to control the size of the rollout tree as the algorithm progresses. The problem of RNA secondary structure prediction is of primary importance within the biology field because the molecule structure is strongly related to its functionality. 
Whereas the contribution of the paper is in the algorithm, the importance of the application makes ExpertRNA a showcase of the relevance of computationally efficient algorithms in supporting scientific discovery.</description><identifier>ISSN: 1091-9856</identifier><identifier>EISSN: 1526-5528</identifier><identifier>EISSN: 1091-9856</identifier><identifier>DOI: 10.1287/ijoc.2022.1188</identifier><language>eng</language><publisher>Linthicum: INFORMS</publisher><subject>Algorithms ; applications ; biology ; computational methods ; computational science ; Datasets ; deterministic ; dynamic programming ; Energy ; Folding ; Free energy ; Genetic algorithms ; industries ; Machine learning ; Molecular biology ; Molecular structure ; Nucleotides ; pharmaceutical ; Ribonucleic acid ; RNA</subject><ispartof>INFORMS journal on computing, 2022-09, Vol.34 (5), p.2464-2484</ispartof><rights>Copyright Institute for Operations Research and the Management Sciences Sep/Oct 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c362t-b6d19f701c06eec12c906931b361f1cc4c582192d307230feb4700edff7293a93</citedby><cites>FETCH-LOGICAL-c362t-b6d19f701c06eec12c906931b361f1cc4c582192d307230feb4700edff7293a93</cites><orcidid>0000-0001-6726-9790</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubsonline.informs.org/doi/full/10.1287/ijoc.2022.1188$$EHTML$$P50$$Ginforms$$H</linktohtml><link.rule.ids>315,781,785,3693,27929,27930,62621</link.rule.ids></links><search><creatorcontrib>Liu, Menghan</creatorcontrib><title>ExpertRNA: A New Framework for RNA Secondary Structure Prediction</title><title>INFORMS journal on computing</title><description>Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular 
tasks. The function of many RNA molecules is strongly related to the structure it adopts. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”—given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms largely rely on finding the structure with minimum free energy. However, the predictions rely on effective simplified free energy models that may not correctly identify the correct structure as the one with the lowest free energy. In light of this, new, data-driven approaches that not only consider free energy, but also use machine learning techniques to learn motifs are also investigated and recently been shown to outperform free energy–based algorithms on several experimental data sets. In this work, we introduce the new ExpertRNA algorithm that provides a modular framework that can easily incorporate an arbitrary number of rewards (free energy or nonparametric/data driven) and secondary structure prediction algorithms. We argue that this capability of ExpertRNA has the potential to balance out different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets, and we compare the performance of ExpertRNA against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions of nonpseudoknotted secondary structures than the structure prediction algorithm used, thus validating the promise of the approach. Summary of Contribution: ExpertRNA is a new algorithm inspired by a biological problem. It is applied to solve the problem of secondary structure prediction for RNA molecules given an input sequence. 
The computational contribution is given by the design of a multibranch, multiexpert rollout algorithm that enables the use of several state-of-the-art approaches as base heuristics and allowing several experts to evaluate partial candidate solutions generated, thus avoiding assuming the reward being optimized by an RNA molecule when folding. Our implementation allows for the effective use of parallel computational resources as well as to control the size of the rollout tree as the algorithm progresses. The problem of RNA secondary structure prediction is of primary importance within the biology field because the molecule structure is strongly related to its functionality. Whereas the contribution of the paper is in the algorithm, the importance of the application makes ExpertRNA a showcase of the relevance of computationally efficient algorithms in supporting scientific discovery.</description><subject>Algorithms</subject><subject>applications</subject><subject>biology</subject><subject>computational methods</subject><subject>computational science</subject><subject>Datasets</subject><subject>deterministic</subject><subject>dynamic programming</subject><subject>Energy</subject><subject>Folding</subject><subject>Free energy</subject><subject>Genetic algorithms</subject><subject>industries</subject><subject>Machine learning</subject><subject>Molecular biology</subject><subject>Molecular structure</subject><subject>Nucleotides</subject><subject>pharmaceutical</subject><subject>Ribonucleic 
acid</subject><subject>RNA</subject><issn>1091-9856</issn><issn>1526-5528</issn><issn>1091-9856</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNqFkM9LwzAUgIMoOKdXzwHPre8lbdp4G2NTYUxxeg5dmkCna2bSMv3vTang0dN78L736yPkGiFFVha3zc7plAFjKWJZnpAJ5kwkec7K05iDxESWuTgnFyHsACDjmZyQ2eLrYHz3sp7d0RldmyNd-mpvjs6_U-s8jQW6Mdq1deW_6abzve56b-izN3Wju8a1l-TMVh_BXP3GKXlbLl7nD8nq6f5xPlslmgvWJVtRo7QFoAZhjEamJQjJccsFWtQ603nJULKaQ8E4WLPNCgBTW1swySvJp-RmnHvw7rM3oVM71_s2rlSs4IJnHBhGKh0p7V0I3lh18M0-3q4Q1KBJDZrUoEkNmmIDHRuGJ5vwh0dXkS8xi0gyIk0blezDfyN_AG47clk</recordid><startdate>20220901</startdate><enddate>20220901</enddate><creator>Liu, Menghan</creator><general>INFORMS</general><general>Institute for Operations Research and the Management Sciences</general><scope>OQ6</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><orcidid>https://orcid.org/0000-0001-6726-9790</orcidid></search><sort><creationdate>20220901</creationdate><title>ExpertRNA: A New Framework for RNA Secondary Structure Prediction</title><author>Liu, Menghan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c362t-b6d19f701c06eec12c906931b361f1cc4c582192d307230feb4700edff7293a93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>applications</topic><topic>biology</topic><topic>computational methods</topic><topic>computational science</topic><topic>Datasets</topic><topic>deterministic</topic><topic>dynamic programming</topic><topic>Energy</topic><topic>Folding</topic><topic>Free energy</topic><topic>Genetic algorithms</topic><topic>industries</topic><topic>Machine learning</topic><topic>Molecular biology</topic><topic>Molecular structure</topic><topic>Nucleotides</topic><topic>pharmaceutical</topic><topic>Ribonucleic 
acid</topic><topic>RNA</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liu, Menghan</creatorcontrib><collection>ECONIS</collection><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>INFORMS journal on computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Liu, Menghan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ExpertRNA: A New Framework for RNA Secondary Structure Prediction</atitle><jtitle>INFORMS journal on computing</jtitle><date>2022-09-01</date><risdate>2022</risdate><volume>34</volume><issue>5</issue><spage>2464</spage><epage>2484</epage><pages>2464-2484</pages><issn>1091-9856</issn><eissn>1526-5528</eissn><eissn>1091-9856</eissn><abstract>Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structure it adopts. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”—given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms largely rely on finding the structure with minimum free energy. However, the predictions rely on effective simplified free energy models that may not correctly identify the correct structure as the one with the lowest free energy. In light of this, new, data-driven approaches that not only consider free energy, but also use machine learning techniques to learn motifs are also investigated and recently been shown to outperform free energy–based algorithms on several experimental data sets. 
In this work, we introduce the new ExpertRNA algorithm that provides a modular framework that can easily incorporate an arbitrary number of rewards (free energy or nonparametric/data driven) and secondary structure prediction algorithms. We argue that this capability of ExpertRNA has the potential to balance out different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets, and we compare the performance of ExpertRNA against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions of nonpseudoknotted secondary structures than the structure prediction algorithm used, thus validating the promise of the approach. Summary of Contribution: ExpertRNA is a new algorithm inspired by a biological problem. It is applied to solve the problem of secondary structure prediction for RNA molecules given an input sequence. The computational contribution is given by the design of a multibranch, multiexpert rollout algorithm that enables the use of several state-of-the-art approaches as base heuristics and allowing several experts to evaluate partial candidate solutions generated, thus avoiding assuming the reward being optimized by an RNA molecule when folding. Our implementation allows for the effective use of parallel computational resources as well as to control the size of the rollout tree as the algorithm progresses. The problem of RNA secondary structure prediction is of primary importance within the biology field because the molecule structure is strongly related to its functionality. 
Whereas the contribution of the paper is in the algorithm, the importance of the application makes ExpertRNA a showcase of the relevance of computationally efficient algorithms in supporting scientific discovery.</abstract><cop>Linthicum</cop><pub>INFORMS</pub><doi>10.1287/ijoc.2022.1188</doi><tpages>21</tpages><orcidid>https://orcid.org/0000-0001-6726-9790</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1091-9856
ispartof INFORMS journal on computing, 2022-09, Vol.34 (5), p.2464-2484
issn 1091-9856
1526-5528
language eng
recordid cdi_proquest_journals_2736343021
source Informs
subjects Algorithms
applications
biology
computational methods
computational science
Datasets
deterministic
dynamic programming
Energy
Folding
Free energy
Genetic algorithms
industries
Machine learning
Molecular biology
Molecular structure
Nucleotides
pharmaceutical
Ribonucleic acid
RNA
title ExpertRNA: A New Framework for RNA Secondary Structure Prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T04%3A31%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ExpertRNA:%20A%20New%20Framework%20for%20RNA%20Secondary%20Structure%20Prediction&rft.jtitle=INFORMS%20journal%20on%20computing&rft.au=Liu,%20Menghan&rft.date=2022-09-01&rft.volume=34&rft.issue=5&rft.spage=2464&rft.epage=2484&rft.pages=2464-2484&rft.issn=1091-9856&rft.eissn=1526-5528&rft_id=info:doi/10.1287/ijoc.2022.1188&rft_dat=%3Cproquest_cross%3E2736343021%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2736343021&rft_id=info:pmid/&rfr_iscdi=true