Human copy number variants are enriched in regions of low mappability

Abstract Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low...

Bibliographic details
Published in: Nucleic acids research 2018-08, Vol.46 (14), p.7236-7249
Main authors: Monlong, Jean; Cossette, Patrick; Meloche, Caroline; Rouleau, Guy; Girard, Simon L; Bourque, Guillaume
Format: Article
Language: eng
Online access: Full text
container_end_page 7249
container_issue 14
container_start_page 7236
container_title Nucleic acids research
container_volume 46
creator Monlong, Jean
Cossette, Patrick
Meloche, Caroline
Rouleau, Guy
Girard, Simon L
Bourque, Guillaume
description Abstract Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.
doi_str_mv 10.1093/nar/gky538
format Article
pmid 30137632
date 2018-08-21
publisher England: Oxford University Press
rights The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. Distributed under a Creative Commons Attribution 4.0 International License
orcid https://orcid.org/0000-0002-9737-5516
tpages 14
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101599/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101599/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/30137632
backlink https://hal.science/hal-04862757
fulltext fulltext
identifier ISSN: 0305-1048
ispartof Nucleic acids research, 2018-08, Vol.46 (14), p.7236-7249
issn 0305-1048
1362-4962
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6101599
source MEDLINE; DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; PubMed Central; Free Full-Text Journals in Chemistry
subjects Centromere - genetics
Chromosome Mapping - methods
DNA Copy Number Variations
Genome, Human - genetics
Genomics
Genomics - methods
Humans
Life Sciences
Neoplasms - genetics
Neoplasms - pathology
Polymorphism, Single Nucleotide
Repetitive Sequences, Nucleic Acid - genetics
Reproducibility of Results
Telomere - genetics
Whole Genome Sequencing - methods
title Human copy number variants are enriched in regions of low mappability
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T15%3A46%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Human%20copy%20number%20variants%20are%20enriched%20in%20regions%20of%20low%20mappability&rft.jtitle=Nucleic%20acids%20research&rft.au=Monlong,%20Jean&rft.date=2018-08-21&rft.volume=46&rft.issue=14&rft.spage=7236&rft.epage=7249&rft.pages=7236-7249&rft.issn=0305-1048&rft.eissn=1362-4962&rft_id=info:doi/10.1093/nar/gky538&rft_dat=%3Cproquest_pubme%3E2092530268%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2092530268&rft_id=info:pmid/30137632&rft_oup_id=10.1093/nar/gky538&rfr_iscdi=true