Assessing Computational Steps for CLIP-Seq Data Analysis
RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. Generating highly reliabl...
Published in: | BioMed research international 2015-01, Vol.2015 (2015), p.1-10 |
---|---|
Main authors: | Rustgi, Anil K.; Madison, Blair B.; Zhong, Xue; Liu, Qi; Shyr, Yu |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 10 |
---|---|
container_issue | 2015 |
container_start_page | 1 |
container_title | BioMed research international |
container_volume | 2015 |
creator | Rustgi, Anil K.; Madison, Blair B.; Zhong, Xue; Liu, Qi; Shyr, Yu |
description | RNA-binding protein (RBP) is a key player in regulating gene expression at the posttranscriptional level. CLIP-Seq, with the ability to provide a genome-wide map of protein-RNA interactions, has been increasingly used to decipher RBP-mediated posttranscriptional regulation. Generating highly reliable binding sites from CLIP-Seq requires not only stringent library preparation but also considerable computational effort. Here we present the first systematic evaluation of major computational steps for identifying RBP binding sites from CLIP-Seq data, including preprocessing, the choice of control samples, peak normalization, and motif discovery. We found that avoiding PCR amplification artifacts, normalizing to input RNA or mRNAseq, and defining the background model from control samples can reduce the bias introduced by RNA abundance and improve the quality of detected binding sites. Our findings can serve as a general guideline for CLIP experiment design and the comprehensive analysis of CLIP-Seq data. |
doi_str_mv | 10.1155/2015/196082 |
format | Article |
pmid | 26539468 |
publisher | Cairo, Egypt: Hindawi Publishing Corporation |
contributor | Choi, Cheol Yong |
orcid | https://orcid.org/0000-0003-2086-9670 |
rights | Copyright © 2015 Qi Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
oa | free_for_read |
toplevel | peer_reviewed; online_resources |
fulltext | fulltext |
identifier | ISSN: 2314-6133 |
ispartof | BioMed research international, 2015-01, Vol.2015 (2015), p.1-10 |
issn | 2314-6133 2314-6141 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4619761 |
source | MEDLINE; PubMed Central Open Access; Wiley Online Library Open Access; PubMed Central; Alma/SFX Local Collection |
subjects | Academic libraries; Binding proteins; Binding Sites - genetics; Caco-2 Cells; Data analysis; DNA sequencing; Gene expression; Gene Expression Regulation; Genomes; High-Throughput Nucleotide Sequencing - methods; Humans; Medicine; Methods; MicroRNAs - genetics; Nucleotide sequencing; Observations; Physiological aspects; Proteins; RNA, Messenger - genetics; RNA, Messenger - metabolism; RNA-Binding Proteins - genetics; RNA-Binding Proteins - metabolism; Sequence Analysis, RNA; Studies |
title | Assessing Computational Steps for CLIP-Seq Data Analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T18%3A30%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Computational%20Steps%20for%20CLIP-Seq%20Data%20Analysis&rft.jtitle=BioMed%20research%20international&rft.au=Rustgi,%20Anil%20K.&rft.date=2015-01-01&rft.volume=2015&rft.issue=2015&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=2314-6133&rft.eissn=2314-6141&rft_id=info:doi/10.1155/2015/196082&rft_dat=%3Cgale_pubme%3EA458160496%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1726683596&rft_id=info:pmid/26539468&rft_galeid=A458160496&rfr_iscdi=true |