A regression model for estimating DNA copy number applied to capture sequencing data

Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the...

Detailed description

Bibliographic details
Published in: Bioinformatics 2012-09, Vol.28 (18), p.2357-2365
Main authors: RIGAILL, Guillem J, CADOT, Sidney, KLUIN, Roelof J. C, ZHENG XUE, BERNARDS, Rene, MAJEWSKI, Ian J, WESSELS, Lodewyk F. A
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2365
container_issue 18
container_start_page 2357
container_title Bioinformatics
container_volume 28
creator RIGAILL, Guillem J
CADOT, Sidney
KLUIN, Roelof J. C
ZHENG XUE
BERNARDS, Rene
MAJEWSKI, Ian J
WESSELS, Lodewyk F. A
description Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is common practice to determine log-ratios between test and control samples, but this approach loses information because it disregards the total coverage or intensity at a locus. We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach can handle regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism (SNP) genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/. Contact: l.wessels@nki.nl. Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/bts448
format Article
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2012-09, Vol.28 (18), p.2357-2365
issn 1367-4803
1367-4811
1460-2059
language eng
recordid cdi_hal_primary_oai_HAL_hal_02648001v1
source PubMed (Medline); Oxford Journals Open Access Collection; MEDLINE; Alma/SFX Local Collection; EZB Electronic Journals Library
subjects Algorithms
Biological and medical sciences
Breast Neoplasms - genetics
Cell Line, Tumor
DNA Copy Number Variations
Female
Fundamental and applied biological sciences. Psychology
General aspects
Genomics - methods
Genotype
Humans
Life Sciences
Linear Models
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Polymorphism, Single Nucleotide
Sequence Analysis, DNA
Vegetal Biology
title A regression model for estimating DNA copy number applied to capture sequencing data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T14%3A16%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20regression%20model%20for%20estimating%20DNA%20copy%20number%20applied%20to%20capture%20sequencing%20data&rft.jtitle=Bioinformatics&rft.au=RIGAILL,%20Guillem%20J&rft.date=2012-09-15&rft.volume=28&rft.issue=18&rft.spage=2357&rft.epage=2365&rft.pages=2357-2365&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/bts448&rft_dat=%3Cproquest_hal_p%3E1434023437%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1434023437&rft_id=info:pmid/22796958&rfr_iscdi=true
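
The description above contrasts log-ratios with a model in which test coverage is a linear function of control coverage. The following is a minimal Python sketch of that contrast, not the authors' C/R implementation. It assumes a least-squares fit through the origin, a diploid control, and coverages that are already normalized to comparable depth; the function names and the per-probe coverage values are hypothetical and chosen only for illustration.

```python
import math


def slope_through_origin(control, test):
    """Least-squares slope b for test ~= b * control (no intercept: an assumption)."""
    denom = sum(c * c for c in control)
    if denom == 0:
        raise ValueError("control coverage is zero at every probe")
    return sum(c * t for c, t in zip(control, test)) / denom


def mean_log2_ratio(control, test):
    """Mean per-probe log2(test / control), or None if any value is zero."""
    if any(c == 0 for c in control) or any(t == 0 for t in test):
        return None  # log2(0) is undefined, so the log-ratio cannot be computed
    ratios = [math.log2(t / c) for c, t in zip(control, test)]
    return sum(ratios) / len(ratios)


# Hypothetical per-probe coverage for one gene (illustrative values only).
control = [100.0, 200.0, 150.0, 50.0]
single_copy_loss = [50.0, 100.0, 75.0, 25.0]
homozygous_deletion = [0.0, 0.0, 0.0, 0.0]

for name, test in [("single-copy loss", single_copy_loss),
                   ("homozygous deletion", homozygous_deletion)]:
    b = slope_through_origin(control, test)
    # Assuming a diploid control, 2 * slope is a rough copy number estimate.
    print(name, "| slope:", b, "| copies:", 2 * b,
          "| mean log2 ratio:", mean_log2_ratio(control, test))
```

In this sketch the fully deleted gene still yields a well-defined slope of zero, whereas the log-ratio is undefined because the test coverage is zero at every probe. The slope also weights probes by their control coverage, which is one way of retaining the total-coverage information that a per-probe ratio discards.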