A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine

RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the pr...

Bibliographic details
Published in: Scientific reports 2018-06, Vol.8 (1), p.9552-10, Article 9552
Main authors: Jain, Dharm Skandh, Gupte, Sanket Rajan, Aduri, Raviprasad
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 10
container_issue 1
container_start_page 9552
container_title Scientific reports
container_volume 8
creator Jain, Dharm Skandh
Gupte, Sanket Rajan
Aduri, Raviprasad
description RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.
doi_str_mv 10.1038/s41598-018-27814-2
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6015049</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2058507608</sourcerecordid><originalsourceid>FETCH-LOGICAL-c474t-6c988c0c22e7154f1de59b2b2f525c283bf6015e6027f9015bdaf30442a22b713</originalsourceid><addsrcrecordid>eNp9kUFPFTEUhRsjEQL8ARemiRs3o-1tO9NuTJ6gSAJKjG5tOp07j5J5LbbzSPz3lvcQwQXd9Kb3O6f35hDykrO3nAn9rkiujG4Y1w10mssGnpE9YFI1IACeP6h3yWEpV6weBUZy84LsgjFCKs72yM8FPXazo8c53GCk52nAiY4p04uMQ_BziEv67cuiuchpxhDpaZwxu_qeYqG9KzjQFOlJdkPAONMPKZWN5tz5yxDxgOyMbip4eHfvkx-fPn4_-tycfT05PVqcNV52cm5ab7T2zANgx5Uc-YDK9NDDqEB50KIfW8YVtgy60dSqH9womJTgAPqOi33yfut7ve5XOPg6S3aTvc5h5fJvm1ywjzsxXNplurG3tkyaavDmziCnX2sss12F4nGaXMS0LhaY0op1LdMVff0fepXWOdb1NhRIIURbKdhSPqdSMo73w3BmbxO02wRtTdBuErRQRa8ernEv-ZtXBcQWKLUVl5j__f2E7R_oTaVm</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2058243336</pqid></control><display><type>article</type><title>A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine</title><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Springer Nature OA Free Journals</source><source>Nature Free</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Jain, Dharm Skandh ; Gupte, Sanket Rajan ; Aduri, Raviprasad</creator><creatorcontrib>Jain, Dharm Skandh ; Gupte, Sanket Rajan ; Aduri, Raviprasad</creatorcontrib><description>RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. 
The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.</description><identifier>ISSN: 2045-2322</identifier><identifier>EISSN: 2045-2322</identifier><identifier>DOI: 10.1038/s41598-018-27814-2</identifier><identifier>PMID: 29934510</identifier><language>eng</language><publisher>London: Nature Publishing Group UK</publisher><subject>631/114/1305 ; 631/114/2397 ; Amino acids ; Comparative analysis ; Computer applications ; Humanities and Social Sciences ; Methods ; miRNA ; multidisciplinary ; Nucleotides ; Protein interaction ; RNA-protein interactions ; Science ; Science (multidisciplinary)</subject><ispartof>Scientific reports, 2018-06, Vol.8 (1), p.9552-10, Article 9552</ispartof><rights>The Author(s) 2018</rights><rights>2018. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c474t-6c988c0c22e7154f1de59b2b2f525c283bf6015e6027f9015bdaf30442a22b713</citedby><cites>FETCH-LOGICAL-c474t-6c988c0c22e7154f1de59b2b2f525c283bf6015e6027f9015bdaf30442a22b713</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015049/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015049/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,724,777,781,861,882,27905,27906,41101,42170,51557,53772,53774</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29934510$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Jain, Dharm Skandh</creatorcontrib><creatorcontrib>Gupte, Sanket Rajan</creatorcontrib><creatorcontrib>Aduri, Raviprasad</creatorcontrib><title>A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine</title><title>Scientific reports</title><addtitle>Sci Rep</addtitle><addtitle>Sci Rep</addtitle><description>RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. 
Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.</description><subject>631/114/1305</subject><subject>631/114/2397</subject><subject>Amino acids</subject><subject>Comparative analysis</subject><subject>Computer applications</subject><subject>Humanities and Social Sciences</subject><subject>Methods</subject><subject>miRNA</subject><subject>multidisciplinary</subject><subject>Nucleotides</subject><subject>Protein interaction</subject><subject>RNA-protein interactions</subject><subject>Science</subject><subject>Science 
(multidisciplinary)</subject><issn>2045-2322</issn><issn>2045-2322</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kUFPFTEUhRsjEQL8ARemiRs3o-1tO9NuTJ6gSAJKjG5tOp07j5J5LbbzSPz3lvcQwQXd9Kb3O6f35hDykrO3nAn9rkiujG4Y1w10mssGnpE9YFI1IACeP6h3yWEpV6weBUZy84LsgjFCKs72yM8FPXazo8c53GCk52nAiY4p04uMQ_BziEv67cuiuchpxhDpaZwxu_qeYqG9KzjQFOlJdkPAONMPKZWN5tz5yxDxgOyMbip4eHfvkx-fPn4_-tycfT05PVqcNV52cm5ab7T2zANgx5Uc-YDK9NDDqEB50KIfW8YVtgy60dSqH9womJTgAPqOi33yfut7ve5XOPg6S3aTvc5h5fJvm1ywjzsxXNplurG3tkyaavDmziCnX2sss12F4nGaXMS0LhaY0op1LdMVff0fepXWOdb1NhRIIURbKdhSPqdSMo73w3BmbxO02wRtTdBuErRQRa8ernEv-ZtXBcQWKLUVl5j__f2E7R_oTaVm</recordid><startdate>20180622</startdate><enddate>20180622</enddate><creator>Jain, Dharm Skandh</creator><creator>Gupte, Sanket Rajan</creator><creator>Aduri, Raviprasad</creator><general>Nature Publishing Group UK</general><general>Nature Publishing 
Group</general><scope>C6C</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20180622</creationdate><title>A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine</title><author>Jain, Dharm Skandh ; Gupte, Sanket Rajan ; Aduri, Raviprasad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c474t-6c988c0c22e7154f1de59b2b2f525c283bf6015e6027f9015bdaf30442a22b713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>631/114/1305</topic><topic>631/114/2397</topic><topic>Amino acids</topic><topic>Comparative analysis</topic><topic>Computer applications</topic><topic>Humanities and Social Sciences</topic><topic>Methods</topic><topic>miRNA</topic><topic>multidisciplinary</topic><topic>Nucleotides</topic><topic>Protein interaction</topic><topic>RNA-protein interactions</topic><topic>Science</topic><topic>Science (multidisciplinary)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jain, Dharm Skandh</creatorcontrib><creatorcontrib>Gupte, Sanket Rajan</creatorcontrib><creatorcontrib>Aduri, Raviprasad</creatorcontrib><collection>Springer Nature OA Free 
Journals</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI 
Edition</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Scientific reports</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jain, Dharm Skandh</au><au>Gupte, Sanket Rajan</au><au>Aduri, Raviprasad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine</atitle><jtitle>Scientific reports</jtitle><stitle>Sci Rep</stitle><addtitle>Sci Rep</addtitle><date>2018-06-22</date><risdate>2018</risdate><volume>8</volume><issue>1</issue><spage>9552</spage><epage>10</epage><pages>9552-10</pages><artnum>9552</artnum><issn>2045-2322</issn><eissn>2045-2322</eissn><abstract>RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. 
The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.</abstract><cop>London</cop><pub>Nature Publishing Group UK</pub><pmid>29934510</pmid><doi>10.1038/s41598-018-27814-2</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2045-2322
ispartof Scientific reports, 2018-06, Vol.8 (1), p.9552-10, Article 9552
issn 2045-2322
2045-2322
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6015049
source DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Springer Nature OA Free Journals; Nature Free; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects 631/114/1305
631/114/2397
Amino acids
Comparative analysis
Computer applications
Humanities and Social Sciences
Methods
miRNA
multidisciplinary
Nucleotides
Protein interaction
RNA-protein interactions
Science
Science (multidisciplinary)
title A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T22%3A31%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Data%20Driven%20Model%20for%20Predicting%20RNA-Protein%20Interactions%20based%20on%20Gradient%20Boosting%20Machine&rft.jtitle=Scientific%20reports&rft.au=Jain,%20Dharm%20Skandh&rft.date=2018-06-22&rft.volume=8&rft.issue=1&rft.spage=9552&rft.epage=10&rft.pages=9552-10&rft.artnum=9552&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/s41598-018-27814-2&rft_dat=%3Cproquest_pubme%3E2058507608%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2058243336&rft_id=info:pmid/29934510&rfr_iscdi=true
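
The description field states that residues are grouped into classes and that a structural unit of five residues serves as the descriptor, which is then passed to a gradient boosting classifier. The following is a minimal sketch of how such a five-residue window descriptor might be computed for an RNA sequence. It is not the authors' implementation: the purine/pyrimidine grouping in NUCLEOTIDE_CLASSES, the function names, and the frequency normalisation are all assumptions made for illustration, since this record does not give the actual class definitions.

```python
from collections import Counter
from itertools import product

# HYPOTHETICAL grouping (purine vs. pyrimidine). The article derives its
# classes from structural data that this record does not reproduce.
NUCLEOTIDE_CLASSES = {"A": "0", "G": "0", "C": "1", "U": "1"}

WINDOW = 5  # "minimum structural unit consisting of five residues"


def encode(sequence, classes):
    """Replace each residue by its class label (raises KeyError on unknown residues)."""
    return "".join(classes[residue] for residue in sequence.upper())


def window_features(sequence, classes, window=WINDOW):
    """Return normalised frequencies of every possible class-level window.

    The vector has len(set(classes.values())) ** window entries, in sorted
    order of the window strings. Sequences shorter than `window` yield zeros.
    """
    encoded = encode(sequence, classes)
    counts = Counter(
        encoded[i:i + window] for i in range(len(encoded) - window + 1)
    )
    total = sum(counts.values())
    alphabet = sorted(set(classes.values()))
    keys = ["".join(combo) for combo in product(alphabet, repeat=window)]
    return [counts[key] / total if total else 0.0 for key in keys]


if __name__ == "__main__":
    # Example short RNA string, chosen only for illustration.
    features = window_features("UUAGGGUUAGGG", NUCLEOTIDE_CLASSES)
    print(len(features), sum(features))
```

A feature vector of this kind, concatenated with an analogous vector for the protein sequence, could then be given to any gradient boosting implementation. The concatenation step is likewise an assumption rather than something this record confirms.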