Human copy number variants are enriched in regions of low mappability
Published in: | Nucleic acids research 2018-08, Vol.46 (14), p.7236-7249 |
---|---|
Main authors: | Monlong, Jean; Cossette, Patrick; Meloche, Caroline; Rouleau, Guy; Girard, Simon L; Bourque, Guillaume |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 7249 |
---|---|
container_issue | 14 |
container_start_page | 7236 |
container_title | Nucleic acids research |
container_volume | 46 |
creator | Monlong, Jean; Cossette, Patrick; Meloche, Caroline; Rouleau, Guy; Girard, Simon L; Bourque, Guillaume |
description | Abstract
Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease. |
doi_str_mv | 10.1093/nar/gky538 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6101599</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/nar/gky538</oup_id><sourcerecordid>2092530268</sourcerecordid><originalsourceid>FETCH-LOGICAL-c442t-90e6da42c483f72602d87b8fe60768bd735a5b63880fea5f6bd786dfa6f8efc53</originalsourceid><addsrcrecordid>eNp9kUtLxDAYRYMoOj42_gDJRlChmkeTphtBRB1hwI2uQ9omM9E2qUk7Mv_eSsdBXbj64H4nJyQXgGOMLjHK6ZVT4Wr-tmJUbIEJppwkac7JNpggiliCUSr2wH6MrwjhFLN0F-xRhGnGKZmAu2nfKAdL366g65tCB7hUwSrXRaiChtoFWy50Ba2DQc-tdxF6A2v_ARvVtqqwte1Wh2DHqDrqo_U8AC_3d8-302T29PB4ezNLyjQlXZIjzSuVkjIV1GSEI1KJrBBGc5RxUVQZZYoVnAqBjFbM8CESvDKKG6FNyegBuB69bV80uiq164KqZRtso8JKemXl742zCzn3S8kxwizPB8H5KFj8OTa9mcmvbPgsTjKWLfHAnq0vC_6917GTjY2lrmvltO-jJCgnjCLCxYBejGgZfIxBm40bI_nVkRw6kmNHA3zy8xEb9LuUATgdAd-3_4k-AVkFmxc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2092530268</pqid></control><display><type>article</type><title>Human copy number variants are enriched in regions of low mappability</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Monlong, Jean ; Cossette, Patrick ; Meloche, Caroline ; Rouleau, Guy ; Girard, Simon L ; Bourque, Guillaume</creator><creatorcontrib>Monlong, Jean ; Cossette, Patrick ; Meloche, Caroline ; Rouleau, Guy ; Girard, Simon L ; Bourque, Guillaume</creatorcontrib><description>Abstract
Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. 
In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.</description><identifier>ISSN: 0305-1048</identifier><identifier>EISSN: 1362-4962</identifier><identifier>DOI: 10.1093/nar/gky538</identifier><identifier>PMID: 30137632</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Centromere - genetics ; Chromosome Mapping - methods ; DNA Copy Number Variations ; Genome, Human - genetics ; Genomics ; Genomics - methods ; Humans ; Life Sciences ; Neoplasms - genetics ; Neoplasms - pathology ; Polymorphism, Single Nucleotide ; Repetitive Sequences, Nucleic Acid - genetics ; Reproducibility of Results ; Telomere - genetics ; Whole Genome Sequencing - methods</subject><ispartof>Nucleic acids research, 2018-08, Vol.46 (14), p.7236-7249</ispartof><rights>The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. 2018</rights><rights>Distributed under a Creative Commons Attribution 4.0 International 
License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c442t-90e6da42c483f72602d87b8fe60768bd735a5b63880fea5f6bd786dfa6f8efc53</citedby><cites>FETCH-LOGICAL-c442t-90e6da42c483f72602d87b8fe60768bd735a5b63880fea5f6bd786dfa6f8efc53</cites><orcidid>0000-0002-9737-5516</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101599/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101599/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,1603,27923,27924,53790,53792</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30137632$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-04862757$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Monlong, Jean</creatorcontrib><creatorcontrib>Cossette, Patrick</creatorcontrib><creatorcontrib>Meloche, Caroline</creatorcontrib><creatorcontrib>Rouleau, Guy</creatorcontrib><creatorcontrib>Girard, Simon L</creatorcontrib><creatorcontrib>Bourque, Guillaume</creatorcontrib><title>Human copy number variants are enriched in regions of low mappability</title><title>Nucleic acids research</title><addtitle>Nucleic Acids Res</addtitle><description>Abstract
Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. 
In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.</description><subject>Centromere - genetics</subject><subject>Chromosome Mapping - methods</subject><subject>DNA Copy Number Variations</subject><subject>Genome, Human - genetics</subject><subject>Genomics</subject><subject>Genomics - methods</subject><subject>Humans</subject><subject>Life Sciences</subject><subject>Neoplasms - genetics</subject><subject>Neoplasms - pathology</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Repetitive Sequences, Nucleic Acid - genetics</subject><subject>Reproducibility of Results</subject><subject>Telomere - genetics</subject><subject>Whole Genome Sequencing - methods</subject><issn>0305-1048</issn><issn>1362-4962</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNp9kUtLxDAYRYMoOj42_gDJRlChmkeTphtBRB1hwI2uQ9omM9E2qUk7Mv_eSsdBXbj64H4nJyQXgGOMLjHK6ZVT4Wr-tmJUbIEJppwkac7JNpggiliCUSr2wH6MrwjhFLN0F-xRhGnGKZmAu2nfKAdL366g65tCB7hUwSrXRaiChtoFWy50Ba2DQc-tdxF6A2v_ARvVtqqwte1Wh2DHqDrqo_U8AC_3d8-302T29PB4ezNLyjQlXZIjzSuVkjIV1GSEI1KJrBBGc5RxUVQZZYoVnAqBjFbM8CESvDKKG6FNyegBuB69bV80uiq164KqZRtso8JKemXl742zCzn3S8kxwizPB8H5KFj8OTa9mcmvbPgsTjKWLfHAnq0vC_6917GTjY2lrmvltO-jJCgnjCLCxYBejGgZfIxBm40bI_nVkRw6kmNHA3zy8xEb9LuUATgdAd-3_4k-AVkFmxc</recordid><startdate>20180821</startdate><enddate>20180821</enddate><creator>Monlong, Jean</creator><creator>Cossette, Patrick</creator><creator>Meloche, Caroline</creator><creator>Rouleau, Guy</creator><creator>Girard, Simon L</creator><creator>Bourque, Guillaume</creator><general>Oxford University 
Press</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>1XC</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-9737-5516</orcidid></search><sort><creationdate>20180821</creationdate><title>Human copy number variants are enriched in regions of low mappability</title><author>Monlong, Jean ; Cossette, Patrick ; Meloche, Caroline ; Rouleau, Guy ; Girard, Simon L ; Bourque, Guillaume</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c442t-90e6da42c483f72602d87b8fe60768bd735a5b63880fea5f6bd786dfa6f8efc53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Centromere - genetics</topic><topic>Chromosome Mapping - methods</topic><topic>DNA Copy Number Variations</topic><topic>Genome, Human - genetics</topic><topic>Genomics</topic><topic>Genomics - methods</topic><topic>Humans</topic><topic>Life Sciences</topic><topic>Neoplasms - genetics</topic><topic>Neoplasms - pathology</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Repetitive Sequences, Nucleic Acid - genetics</topic><topic>Reproducibility of Results</topic><topic>Telomere - genetics</topic><topic>Whole Genome Sequencing - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Monlong, Jean</creatorcontrib><creatorcontrib>Cossette, Patrick</creatorcontrib><creatorcontrib>Meloche, Caroline</creatorcontrib><creatorcontrib>Rouleau, Guy</creatorcontrib><creatorcontrib>Girard, Simon L</creatorcontrib><creatorcontrib>Bourque, Guillaume</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nucleic acids research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Monlong, Jean</au><au>Cossette, Patrick</au><au>Meloche, Caroline</au><au>Rouleau, Guy</au><au>Girard, Simon L</au><au>Bourque, Guillaume</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Human copy number variants are enriched in regions of low mappability</atitle><jtitle>Nucleic acids research</jtitle><addtitle>Nucleic Acids Res</addtitle><date>2018-08-21</date><risdate>2018</risdate><volume>46</volume><issue>14</issue><spage>7236</spage><epage>7249</epage><pages>7236-7249</pages><issn>0305-1048</issn><eissn>1362-4962</eissn><abstract>Abstract
Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>30137632</pmid><doi>10.1093/nar/gky538</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-9737-5516</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0305-1048 |
ispartof | Nucleic acids research, 2018-08, Vol.46 (14), p.7236-7249 |
issn | 0305-1048; 1362-4962 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6101599 |
source | MEDLINE; DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; PubMed Central; Free Full-Text Journals in Chemistry |
subjects | Centromere - genetics; Chromosome Mapping - methods; DNA Copy Number Variations; Genome, Human - genetics; Genomics; Genomics - methods; Humans; Life Sciences; Neoplasms - genetics; Neoplasms - pathology; Polymorphism, Single Nucleotide; Repetitive Sequences, Nucleic Acid - genetics; Reproducibility of Results; Telomere - genetics; Whole Genome Sequencing - methods |
title | Human copy number variants are enriched in regions of low mappability |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T15%3A46%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Human%20copy%20number%20variants%20are%20enriched%20in%20regions%20of%20low%20mappability&rft.jtitle=Nucleic%20acids%20research&rft.au=Monlong,%20Jean&rft.date=2018-08-21&rft.volume=46&rft.issue=14&rft.spage=7236&rft.epage=7249&rft.pages=7236-7249&rft.issn=0305-1048&rft.eissn=1362-4962&rft_id=info:doi/10.1093/nar/gky538&rft_dat=%3Cproquest_pubme%3E2092530268%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2092530268&rft_id=info:pmid/30137632&rft_oup_id=10.1093/nar/gky538&rfr_iscdi=true |