Assessing Computational Steps for CLIP-Seq Data Analysis

RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. Generating highly reliabl...

Detailed description

Bibliographic details
Published in: BioMed research international 2015-01, Vol.2015 (2015), p.1-10
Main authors: Rustgi, Anil K., Madison, Blair B., Zhong, Xue, Liu, Qi, Shyr, Yu
Format: Article
Language: eng
Online access: Full text
container_end_page 10
container_issue 2015
container_start_page 1
container_title BioMed research international
container_volume 2015
creator Rustgi, Anil K.
Madison, Blair B.
Zhong, Xue
Liu, Qi
Shyr, Yu
description RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. Generating highly reliable binding sites from CLIP-Seq requires not only stringent library preparation but also considerable computational efforts. Here we presented a first systematic evaluation of major computational steps for identifying RBP binding sites from CLIP-Seq data, including preprocessing, the choice of control samples, peak normalization, and motif discovery. We found that avoiding PCR amplification artifacts, normalizing to input RNA or mRNAseq, and defining the background model from control samples can reduce the bias introduced by RNA abundance and improve the quality of detected binding sites. Our findings can serve as a general guideline for CLIP experiments design and the comprehensive analysis of CLIP-Seq data.
doi_str_mv 10.1155/2015/196082
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4619761</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A458160496</galeid><sourcerecordid>A458160496</sourcerecordid><originalsourceid>FETCH-LOGICAL-c528t-35cef064f8d67bbdd370cc830b801920cfc2d9351ca7d970c9b95d9cce0e9c153</originalsourceid><addsrcrecordid>eNqNkd1LwzAUxYMoTuaefJeCL6LU5aNJkxdhzK_BQGH6HNI0nZGumU2r7L83dTqnT-blBs6Pw7n3AHCE4AVClA4xRHSIBIMc74ADTFASM5Sg3c2fkB4YeP8Cw-OIQcH2QQ8zSkTC-AHgI--N97aaR2O3WLaNaqyrVBnNGrP0UeHqaDydPMQz8xpdqUZFoyCuvPWHYK9QpTeDr9kHTzfXj-O7eHp_OxmPprGmmDcxodoUkCUFz1maZXlOUqg1JzDjEAkMdaFxLghFWqW5CJrIBM2F1gYaoRElfXC59l222cLk2lRNrUq5rO1C1SvplJW_lco-y7l7kwlDIg3798Hpl0HtXlvjG7mwXpuyVJVxrZcoJZhjmhL-HxSlArPPWCd_0BfX1uE2HYUZ44QK9kPNVWmkrQoXIurOVI4S2tWRfFLna0rXzvvaFJvtEJRdy7JrWa5bDvTx9kE27HenAThbA8-2ytW7_Z-bCYgp1BZMQ0JKPgAmJ7Wj</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1726683596</pqid></control><display><type>article</type><title>Assessing Computational Steps for CLIP-Seq Data Analysis</title><source>MEDLINE</source><source>PubMed Central Open Access</source><source>Wiley Online Library Open Access</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Rustgi, Anil K. ; Madison, Blair B. ; Zhong, Xue ; Liu, Qi ; Shyr, Yu</creator><contributor>Choi, Cheol Yong</contributor><creatorcontrib>Rustgi, Anil K. ; Madison, Blair B. ; Zhong, Xue ; Liu, Qi ; Shyr, Yu ; Choi, Cheol Yong</creatorcontrib><description>RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. 
Generating highly reliable binding sites from CLIP-Seq requires not only stringent library preparation but also considerable computational efforts. Here we presented a first systematic evaluation of major computational steps for identifying RBP binding sites from CLIP-Seq data, including preprocessing, the choice of control samples, peak normalization, and motif discovery. We found that avoiding PCR amplification artifacts, normalizing to input RNA or mRNAseq, and defining the background model from control samples can reduce the bias introduced by RNA abundance and improve the quality of detected binding sites. Our findings can serve as a general guideline for CLIP experiments design and the comprehensive analysis of CLIP-Seq data.</description><identifier>ISSN: 2314-6133</identifier><identifier>EISSN: 2314-6141</identifier><identifier>DOI: 10.1155/2015/196082</identifier><identifier>PMID: 26539468</identifier><language>eng</language><publisher>Cairo, Egypt: Hindawi Publishing Corporation</publisher><subject>Academic libraries ; Binding proteins ; Binding Sites - genetics ; Caco-2 Cells ; Data analysis ; DNA sequencing ; Gene expression ; Gene Expression Regulation ; Genomes ; High-Throughput Nucleotide Sequencing - methods ; Humans ; Medicine ; Methods ; MicroRNAs - genetics ; Nucleotide sequencing ; Observations ; Physiological aspects ; Proteins ; RNA, Messenger - genetics ; RNA, Messenger - metabolism ; RNA-Binding Proteins - genetics ; RNA-Binding Proteins - metabolism ; Sequence Analysis, RNA ; Studies</subject><ispartof>BioMed research international, 2015-01, Vol.2015 (2015), p.1-10</ispartof><rights>Copyright © 2015 Qi Liu et al.</rights><rights>COPYRIGHT 2015 John Wiley &amp; Sons, Inc.</rights><rights>Copyright © 2015 Qi Liu et al. 
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2015 Qi Liu et al. 2015</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c528t-35cef064f8d67bbdd370cc830b801920cfc2d9351ca7d970c9b95d9cce0e9c153</citedby><cites>FETCH-LOGICAL-c528t-35cef064f8d67bbdd370cc830b801920cfc2d9351ca7d970c9b95d9cce0e9c153</cites><orcidid>0000-0003-2086-9670</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619761/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619761/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26539468$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Choi, Cheol Yong</contributor><creatorcontrib>Rustgi, Anil K.</creatorcontrib><creatorcontrib>Madison, Blair B.</creatorcontrib><creatorcontrib>Zhong, Xue</creatorcontrib><creatorcontrib>Liu, Qi</creatorcontrib><creatorcontrib>Shyr, Yu</creatorcontrib><title>Assessing Computational Steps for CLIP-Seq Data Analysis</title><title>BioMed research international</title><addtitle>Biomed Res Int</addtitle><description>RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. 
Generating highly reliable binding sites from CLIP-Seq requires not only stringent library preparation but also considerable computational efforts. Here we presented a first systematic evaluation of major computational steps for identifying RBP binding sites from CLIP-Seq data, including preprocessing, the choice of control samples, peak normalization, and motif discovery. We found that avoiding PCR amplification artifacts, normalizing to input RNA or mRNAseq, and defining the background model from control samples can reduce the bias introduced by RNA abundance and improve the quality of detected binding sites. Our findings can serve as a general guideline for CLIP experiments design and the comprehensive analysis of CLIP-Seq data.</description><subject>Academic libraries</subject><subject>Binding proteins</subject><subject>Binding Sites - genetics</subject><subject>Caco-2 Cells</subject><subject>Data analysis</subject><subject>DNA sequencing</subject><subject>Gene expression</subject><subject>Gene Expression Regulation</subject><subject>Genomes</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Humans</subject><subject>Medicine</subject><subject>Methods</subject><subject>MicroRNAs - genetics</subject><subject>Nucleotide sequencing</subject><subject>Observations</subject><subject>Physiological aspects</subject><subject>Proteins</subject><subject>RNA, Messenger - genetics</subject><subject>RNA, Messenger - metabolism</subject><subject>RNA-Binding Proteins - genetics</subject><subject>RNA-Binding Proteins - metabolism</subject><subject>Sequence Analysis, 
RNA</subject><subject>Studies</subject><issn>2314-6133</issn><issn>2314-6141</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqNkd1LwzAUxYMoTuaefJeCL6LU5aNJkxdhzK_BQGH6HNI0nZGumU2r7L83dTqnT-blBs6Pw7n3AHCE4AVClA4xRHSIBIMc74ADTFASM5Sg3c2fkB4YeP8Cw-OIQcH2QQ8zSkTC-AHgI--N97aaR2O3WLaNaqyrVBnNGrP0UeHqaDydPMQz8xpdqUZFoyCuvPWHYK9QpTeDr9kHTzfXj-O7eHp_OxmPprGmmDcxodoUkCUFz1maZXlOUqg1JzDjEAkMdaFxLghFWqW5CJrIBM2F1gYaoRElfXC59l222cLk2lRNrUq5rO1C1SvplJW_lco-y7l7kwlDIg3798Hpl0HtXlvjG7mwXpuyVJVxrZcoJZhjmhL-HxSlArPPWCd_0BfX1uE2HYUZ44QK9kPNVWmkrQoXIurOVI4S2tWRfFLna0rXzvvaFJvtEJRdy7JrWa5bDvTx9kE27HenAThbA8-2ytW7_Z-bCYgp1BZMQ0JKPgAmJ7Wj</recordid><startdate>20150101</startdate><enddate>20150101</enddate><creator>Rustgi, Anil K.</creator><creator>Madison, Blair B.</creator><creator>Zhong, Xue</creator><creator>Liu, Qi</creator><creator>Shyr, Yu</creator><general>Hindawi Publishing Corporation</general><general>John Wiley &amp; Sons, Inc</general><general>Hindawi 
Limited</general><scope>ADJCN</scope><scope>AHFXO</scope><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QL</scope><scope>7QO</scope><scope>7T7</scope><scope>7TK</scope><scope>7U7</scope><scope>7U9</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M7N</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-2086-9670</orcidid></search><sort><creationdate>20150101</creationdate><title>Assessing Computational Steps for CLIP-Seq Data Analysis</title><author>Rustgi, Anil K. ; Madison, Blair B. 
; Zhong, Xue ; Liu, Qi ; Shyr, Yu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c528t-35cef064f8d67bbdd370cc830b801920cfc2d9351ca7d970c9b95d9cce0e9c153</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Academic libraries</topic><topic>Binding proteins</topic><topic>Binding Sites - genetics</topic><topic>Caco-2 Cells</topic><topic>Data analysis</topic><topic>DNA sequencing</topic><topic>Gene expression</topic><topic>Gene Expression Regulation</topic><topic>Genomes</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Humans</topic><topic>Medicine</topic><topic>Methods</topic><topic>MicroRNAs - genetics</topic><topic>Nucleotide sequencing</topic><topic>Observations</topic><topic>Physiological aspects</topic><topic>Proteins</topic><topic>RNA, Messenger - genetics</topic><topic>RNA, Messenger - metabolism</topic><topic>RNA-Binding Proteins - genetics</topic><topic>RNA-Binding Proteins - metabolism</topic><topic>Sequence Analysis, RNA</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rustgi, Anil K.</creatorcontrib><creatorcontrib>Madison, Blair B.</creatorcontrib><creatorcontrib>Zhong, Xue</creatorcontrib><creatorcontrib>Liu, Qi</creatorcontrib><creatorcontrib>Shyr, Yu</creatorcontrib><collection>الدوريات العلمية والإحصائية - e-Marefa Academic and Statistical Periodicals</collection><collection>معرفة - المحتوى العربي الأكاديمي المتكامل - e-Marefa Academic Complete</collection><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Neurosciences Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>Middle East &amp; Africa Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection 
(Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BioMed research international</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rustgi, Anil K.</au><au>Madison, Blair B.</au><au>Zhong, Xue</au><au>Liu, Qi</au><au>Shyr, Yu</au><au>Choi, Cheol Yong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Assessing Computational Steps for CLIP-Seq Data Analysis</atitle><jtitle>BioMed research international</jtitle><addtitle>Biomed Res Int</addtitle><date>2015-01-01</date><risdate>2015</risdate><volume>2015</volume><issue>2015</issue><spage>1</spage><epage>10</epage><pages>1-10</pages><issn>2314-6133</issn><eissn>2314-6141</eissn><abstract>RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. 
CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. Generating highly reliable binding sites from CLIP-Seq requires not only stringent library preparation but also considerable computational efforts. Here we presented a first systematic evaluation of major computational steps for identifying RBP binding sites from CLIP-Seq data, including preprocessing, the choice of control samples, peak normalization, and motif discovery. We found that avoiding PCR amplification artifacts, normalizing to input RNA or mRNAseq, and defining the background model from control samples can reduce the bias introduced by RNA abundance and improve the quality of detected binding sites. Our findings can serve as a general guideline for CLIP experiments design and the comprehensive analysis of CLIP-Seq data.</abstract><cop>Cairo, Egypt</cop><pub>Hindawi Publishing Corporation</pub><pmid>26539468</pmid><doi>10.1155/2015/196082</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0003-2086-9670</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2314-6133
ispartof BioMed research international, 2015-01, Vol.2015 (2015), p.1-10
issn 2314-6133
2314-6141
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4619761
source MEDLINE; PubMed Central Open Access; Wiley Online Library Open Access; PubMed Central; Alma/SFX Local Collection
subjects Academic libraries
Binding proteins
Binding Sites - genetics
Caco-2 Cells
Data analysis
DNA sequencing
Gene expression
Gene Expression Regulation
Genomes
High-Throughput Nucleotide Sequencing - methods
Humans
Medicine
Methods
MicroRNAs - genetics
Nucleotide sequencing
Observations
Physiological aspects
Proteins
RNA, Messenger - genetics
RNA, Messenger - metabolism
RNA-Binding Proteins - genetics
RNA-Binding Proteins - metabolism
Sequence Analysis, RNA
Studies
title Assessing Computational Steps for CLIP-Seq Data Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T18%3A30%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Computational%20Steps%20for%20CLIP-Seq%20Data%20Analysis&rft.jtitle=BioMed%20research%20international&rft.au=Rustgi,%20Anil%20K.&rft.date=2015-01-01&rft.volume=2015&rft.issue=2015&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=2314-6133&rft.eissn=2314-6141&rft_id=info:doi/10.1155/2015/196082&rft_dat=%3Cgale_pubme%3EA458160496%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1726683596&rft_id=info:pmid/26539468&rft_galeid=A458160496&rfr_iscdi=true