Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks

Abstract Motivation: RNA-binding proteins (RBPs) account for 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-consuming and costly. Instead, computational prediction of the RBP binding...

Bibliographic details
Published in: Bioinformatics 2018-10, Vol.34 (20), p.3427-3436
Main authors: Pan, Xiaoyong; Shen, Hong-Bin
Format: Article
Language: eng
container_end_page 3436
container_issue 20
container_start_page 3427
container_title Bioinformatics
container_volume 34
creator Pan, Xiaoyong
Shen, Hong-Bin
description Abstract Motivation: RNA-binding proteins (RBPs) account for 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-consuming and costly. Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. Results: In this study, we present a computational method, iDeepE, to predict RNA–protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs on the multiple subsequences and on the padded sequences, respectively, to learn high-level features. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates better performance than state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability and implementation: https://github.com/xypan1232/iDeepE Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/bty364
format Article
fullrecord <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2034289188</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/bty364</oup_id><sourcerecordid>2034289188</sourcerecordid><originalsourceid>FETCH-LOGICAL-c463t-415903fd8ce32f9d657d22c6847451ed30cf61edeb626eb4a515694fcd4a20d73</originalsourceid><addsrcrecordid>eNqNkMtOwzAQRS0EgvL4BFCWbEL9TrKsKl5SBQjBOkpspzUkdrEdqu74B_6QL8F9gMSO1b2aOTOjuQCcIniBYEGGtbbaNNZ1VdDCD-uwJJzugAGiHKYYsmI3esKzlOaQHIBD718gZIhSug8OcJFhnHM2AIsHp6QWQZtp8ng3-vr4nDsblDZJrY1cVb0OyieVkUlng258EmbO9tNZImwXmRXSWlG1a2Ta2jpaqdQ89s27bfugrYklo3q3lrCw7tUfg72mar062eoReL66fBrfpJP769vxaJIKyklIKWIFJI3MhSK4KSRnmcRY8JxmlCElCRQNj6pqjrmqacUQ4wVthKQVhjIjR-B8sze-9dYrH8pOe6HatjLK9r7EkFCcFyjPI8o2qHDWe6eacu50V7lliWC5yrz8m3m5yTzOnW1P9HWn5O_UT8gRgBvA9vN_7vwGWo2YAg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2034289188</pqid></control><display><type>article</type><title>Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks</title><source>Oxford Journals Open Access Collection</source><creator>Pan, Xiaoyong ; Shen, Hong-Bin</creator><creatorcontrib>Pan, Xiaoyong ; Shen, Hong-Bin</creatorcontrib><description>Abstract Motivation RNA-binding proteins (RBPs) take over 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and high-costly. Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. 
However, in computational modeling using deep learning, to our best knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. Results In this study, we present a computational method iDeepE to predict RNA–protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences into the same length. For the local CNN, we split a RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates a better performance over state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability and implementation https://github.com/xypan1232/iDeepE Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/bty364</identifier><identifier>PMID: 29722865</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><ispartof>Bioinformatics, 2018-10, Vol.34 (20), p.3427-3436</ispartof><rights>The Author(s) 2018. Published by Oxford University Press. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c463t-415903fd8ce32f9d657d22c6847451ed30cf61edeb626eb4a515694fcd4a20d73</citedby><cites>FETCH-LOGICAL-c463t-415903fd8ce32f9d657d22c6847451ed30cf61edeb626eb4a515694fcd4a20d73</cites><orcidid>0000-0002-4029-3325</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1598,27901,27902</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/bty364$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29722865$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Pan, Xiaoyong</creatorcontrib><creatorcontrib>Shen, Hong-Bin</creatorcontrib><title>Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation RNA-binding proteins (RBPs) take over 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and high-costly. Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to our best knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. 
Results In this study, we present a computational method iDeepE to predict RNA–protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences into the same length. For the local CNN, we split a RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates a better performance over state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability and implementation https://github.com/xypan1232/iDeepE Supplementary information Supplementary data are available at Bioinformatics online.</description><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNqNkMtOwzAQRS0EgvL4BFCWbEL9TrKsKl5SBQjBOkpspzUkdrEdqu74B_6QL8F9gMSO1b2aOTOjuQCcIniBYEGGtbbaNNZ1VdDCD-uwJJzugAGiHKYYsmI3esKzlOaQHIBD718gZIhSug8OcJFhnHM2AIsHp6QWQZtp8ng3-vr4nDsblDZJrY1cVb0OyieVkUlng258EmbO9tNZImwXmRXSWlG1a2Ta2jpaqdQ89s27bfugrYklo3q3lrCw7tUfg72mar062eoReL66fBrfpJP769vxaJIKyklIKWIFJI3MhSK4KSRnmcRY8JxmlCElCRQNj6pqjrmqacUQ4wVthKQVhjIjR-B8sze-9dYrH8pOe6HatjLK9r7EkFCcFyjPI8o2qHDWe6eacu50V7lliWC5yrz8m3m5yTzOnW1P9HWn5O_UT8gRgBvA9vN_7vwGWo2YAg</recordid><startdate>20181015</startdate><enddate>20181015</enddate><creator>Pan, Xiaoyong</creator><creator>Shen, Hong-Bin</creator><general>Oxford University 
Press</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-4029-3325</orcidid></search><sort><creationdate>20181015</creationdate><title>Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks</title><author>Pan, Xiaoyong ; Shen, Hong-Bin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c463t-415903fd8ce32f9d657d22c6847451ed30cf61edeb626eb4a515694fcd4a20d73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Pan, Xiaoyong</creatorcontrib><creatorcontrib>Shen, Hong-Bin</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Pan, Xiaoyong</au><au>Shen, Hong-Bin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2018-10-15</date><risdate>2018</risdate><volume>34</volume><issue>20</issue><spage>3427</spage><epage>3436</epage><pages>3427-3436</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><abstract>Abstract Motivation RNA-binding proteins (RBPs) take over 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and high-costly. 
Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to our best knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. Results In this study, we present a computational method iDeepE to predict RNA–protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences into the same length. For the local CNN, we split a RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates a better performance over state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability and implementation https://github.com/xypan1232/iDeepE Supplementary information Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>29722865</pmid><doi>10.1093/bioinformatics/bty364</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-4029-3325</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2018-10, Vol.34 (20), p.3427-3436
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_2034289188
source Oxford Journals Open Access Collection
title Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T07%3A42%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20RNA%E2%80%93protein%20binding%20sites%20and%20motifs%20through%20combining%20local%20and%20global%20deep%20convolutional%20neural%20networks&rft.jtitle=Bioinformatics&rft.au=Pan,%20Xiaoyong&rft.date=2018-10-15&rft.volume=34&rft.issue=20&rft.spage=3427&rft.epage=3436&rft.pages=3427-3436&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/bty364&rft_dat=%3Cproquest_TOX%3E2034289188%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2034289188&rft_id=info:pmid/29722865&rft_oup_id=10.1093/bioinformatics/bty364&rfr_iscdi=true
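
The two input preparations named in the abstract (padding every sequence to one length for the global CNN, and splitting a sequence into overlapping fixed-length subsequences for the local CNN) can be illustrated with a minimal Python sketch. This is not taken from the iDeepE repository: the function names, the pad character `N`, the all-zero encoding of padding, and the window and overlap sizes are assumptions chosen for illustration only.

```python
from typing import List

ALPHABET = "ACGU"  # RNA bases; any other character is treated as padding


def pad_sequence(seq: str, max_len: int, pad_char: str = "N") -> str:
    """Truncate or right-pad a sequence to exactly max_len characters
    (hypothetical stand-in for the global-CNN input preparation)."""
    return seq[:max_len].ljust(max_len, pad_char)


def split_overlapping(seq: str, window: int, overlap: int,
                      pad_char: str = "N") -> List[str]:
    """Split a sequence into fixed-length windows sharing `overlap` characters
    (hypothetical stand-in for the local-CNN input preparation).
    The final window is right-padded if the sequence runs out."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    windows = []
    start = 0
    while True:
        windows.append(seq[start:start + window].ljust(window, pad_char))
        if start + window >= len(seq):
            break
        start += step
    return windows


def one_hot(seq: str) -> List[List[float]]:
    """One-hot encode each base over ACGU; padding becomes an all-zero row
    (an assumption, other encodings of padding are possible)."""
    return [[1.0 if base == a else 0.0 for a in ALPHABET] for base in seq]


if __name__ == "__main__":
    rna = "ACGUACGUAC"
    print(pad_sequence(rna, 12))                       # global input
    print(split_overlapping(rna, window=4, overlap=2))  # local inputs
```

Each window returned by `split_overlapping` would be encoded with `one_hot` and stacked as one channel of the local CNN input, while the padded full sequence feeds the global CNN. How the two outputs are combined is not specified in the abstract, so it is left out of this sketch.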