Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome
The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate it...
Published in: | Scientific reports 2016-06, Vol.6 (1), p.27722-27722, Article 27722 |
---|---|
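Main authors: | Fonville, Natalie C.; Velmurugan, Karthik Raja; Tae, Hongseok; Vaksman, Zalman; McIver, Lauren J.; Garner, Harold R. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |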
container_end_page | 27722 |
---|---|
container_issue | 1 |
container_start_page | 27722 |
container_title | Scientific reports |
container_volume | 6 |
creator | Fonville, Natalie C.; Velmurugan, Karthik Raja; Tae, Hongseok; Vaksman, Zalman; McIver, Lauren J.; Garner, Harold R. |
description | The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples, (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and fosmid clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA. |
doi_str_mv | 10.1038/srep27722 |
format | Article |
fullrecord | PMID: 27278669; DOI: 10.1038/srep27722; ISSN: 2045-2322; EISSN: 2045-2322; publisher: London: Nature Publishing Group UK; publication date: 2016-06-09; rights: The Author(s) 2016; peer_reviewed; oa: free_for_read |
fulltext | fulltext |
identifier | ISSN: 2045-2322 |
ispartof | Scientific reports, 2016-06, Vol.6 (1), p.27722-27722, Article 27722 |
issn | 2045-2322 2045-2322 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4899811 |
source | MEDLINE; Nature Free; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry; Springer Nature OA Free Journals |
subjects | 631/114/2785/2302; 631/208/212/2302; Algorithms; Animals; Cell Line; Contig Mapping; Design; Genome, Human; Genomes; Genomics; Humanities and Social Sciences; Humans; Microsatellite Repeats; multidisciplinary; Pan troglodytes - genetics; Science; Sequence Analysis, DNA - methods; Sequence Analysis, RNA - methods |
title | Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T00%3A18%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genomic%20leftovers:%20identifying%20novel%20microsatellites,%20over-represented%20motifs%20and%20functional%20elements%20in%20the%20human%20genome&rft.jtitle=Scientific%20reports&rft.au=Fonville,%20Natalie%20C.&rft.date=2016-06-09&rft.volume=6&rft.issue=1&rft.spage=27722&rft.epage=27722&rft.pages=27722-27722&rft.artnum=27722&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/srep27722&rft_dat=%3Cproquest_pubme%3E4084347441%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1795535075&rft_id=info:pmid/27278669&rfr_iscdi=true |