A regression model for estimating DNA copy number applied to capture sequencing data
Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the...
Published in: | Bioinformatics 2012-09, Vol.28 (18), p.2357-2365 |
---|---|
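Main authors: | RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A |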
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2365 |
---|---|
container_issue | 18 |
container_start_page | 2357 |
container_title | Bioinformatics |
container_volume | 28 |
creator | RIGAILL, Guillem J CADOT, Sidney KLUIN, Roelof J. C ZHENG XUE BERNARDS, Rene MAJEWSKI, Ian J WESSELS, Lodewyk F. A |
description | Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus.
We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data.
The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/
l.wessels@nki.nl
Supplementary data are available at Bioinformatics online. |
doi_str_mv | 10.1093/bioinformatics/bts448 |
format | Article |
fullrecord | <record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_02648001v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1434023437</sourcerecordid><originalsourceid>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</originalsourceid><addsrcrecordid>eNpVkMtKAzEUhoMo3h9ByUbQRW3uM10W71B0033IJCd1ZGYyJjOCb29Ka8VVwuE7f_58CF1QckvJjE-rOtSdD7E1Q23TtBqSEOUeOqZcFRNRUrq_uxN-hE5S-iCESCLVITpirJipmSyP0XKOI6wipFSHDrfBQYNzKoY01OvoboXvX-fYhv4bd2NbQcSm75saHB4CtqYfxgg4wecInV3TzgzmDB140yQ4356naPn4sLx7nizenl7u5ouJFZIPE-VKI7hQ0isPlDE2A1JK55hTjEhbgHOScgu-YgQ8N0bSPFJVJSjzXvBTdLOJfTeN7mPuG791MLV-ni_0ekaYyr8n9Itm9nrD9jHkrmnQbZ0sNI3pIIxJ01yEMC54kVG5QW0MKUXwu2xK9Nq9_u9eb9znvcvtE2PVgttt_crOwNUWMMmaxkeTlaU_TnGuZFHyH0ACkqA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1434023437</pqid></control><display><type>article</type><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><source>PubMed (Medline)</source><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>Alma/SFX Local Collection</source><source>EZB Electronic Journals Library</source><creator>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A</creator><creatorcontrib>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A</creatorcontrib><description>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. 
In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus.
We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data.
The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/
l.wessels@nki.nl
Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/bts448</identifier><identifier>PMID: 22796958</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Biological and medical sciences ; Breast Neoplasms - genetics ; Cell Line, Tumor ; DNA Copy Number Variations ; Female ; Fundamental and applied biological sciences. Psychology ; General aspects ; Genomics - methods ; Genotype ; Humans ; Life Sciences ; Linear Models ; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) ; Polymorphism, Single Nucleotide ; Sequence Analysis, DNA ; Vegetal Biology</subject><ispartof>Bioinformatics, 2012-09, Vol.28 (18), p.2357-2365</ispartof><rights>2015 INIST-CNRS</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</citedby><cites>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=26336578$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22796958$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.inrae.fr/hal-02648001$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>RIGAILL, 
Guillem J</creatorcontrib><creatorcontrib>CADOT, Sidney</creatorcontrib><creatorcontrib>KLUIN, Roelof J. C</creatorcontrib><creatorcontrib>ZHENG XUE</creatorcontrib><creatorcontrib>BERNARDS, Rene</creatorcontrib><creatorcontrib>MAJEWSKI, Ian J</creatorcontrib><creatorcontrib>WESSELS, Lodewyk F. A</creatorcontrib><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus.
We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data.
The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/
l.wessels@nki.nl
Supplementary data are available at Bioinformatics online.</description><subject>Algorithms</subject><subject>Biological and medical sciences</subject><subject>Breast Neoplasms - genetics</subject><subject>Cell Line, Tumor</subject><subject>DNA Copy Number Variations</subject><subject>Female</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Genomics - methods</subject><subject>Genotype</subject><subject>Humans</subject><subject>Life Sciences</subject><subject>Linear Models</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Sequence Analysis, DNA</subject><subject>Vegetal Biology</subject><issn>1367-4803</issn><issn>1367-4811</issn><issn>1460-2059</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkMtKAzEUhoMo3h9ByUbQRW3uM10W71B0033IJCd1ZGYyJjOCb29Ka8VVwuE7f_58CF1QckvJjE-rOtSdD7E1Q23TtBqSEOUeOqZcFRNRUrq_uxN-hE5S-iCESCLVITpirJipmSyP0XKOI6wipFSHDrfBQYNzKoY01OvoboXvX-fYhv4bd2NbQcSm75saHB4CtqYfxgg4wecInV3TzgzmDB140yQ4356naPn4sLx7nizenl7u5ouJFZIPE-VKI7hQ0isPlDE2A1JK55hTjEhbgHOScgu-YgQ8N0bSPFJVJSjzXvBTdLOJfTeN7mPuG791MLV-ni_0ekaYyr8n9Itm9nrD9jHkrmnQbZ0sNI3pIIxJ01yEMC54kVG5QW0MKUXwu2xK9Nq9_u9eb9znvcvtE2PVgttt_crOwNUWMMmaxkeTlaU_TnGuZFHyH0ACkqA</recordid><startdate>20120915</startdate><enddate>20120915</enddate><creator>RIGAILL, Guillem J</creator><creator>CADOT, Sidney</creator><creator>KLUIN, Roelof J. C</creator><creator>ZHENG XUE</creator><creator>BERNARDS, Rene</creator><creator>MAJEWSKI, Ian J</creator><creator>WESSELS, Lodewyk F. 
A</creator><general>Oxford University Press</general><general>Oxford University Press (OUP)</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7TM</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>1XC</scope></search><sort><creationdate>20120915</creationdate><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><author>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Algorithms</topic><topic>Biological and medical sciences</topic><topic>Breast Neoplasms - genetics</topic><topic>Cell Line, Tumor</topic><topic>DNA Copy Number Variations</topic><topic>Female</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Genomics - methods</topic><topic>Genotype</topic><topic>Humans</topic><topic>Life Sciences</topic><topic>Linear Models</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Sequence Analysis, DNA</topic><topic>Vegetal Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>RIGAILL, Guillem J</creatorcontrib><creatorcontrib>CADOT, Sidney</creatorcontrib><creatorcontrib>KLUIN, Roelof J. 
C</creatorcontrib><creatorcontrib>ZHENG XUE</creatorcontrib><creatorcontrib>BERNARDS, Rene</creatorcontrib><creatorcontrib>MAJEWSKI, Ian J</creatorcontrib><creatorcontrib>WESSELS, Lodewyk F. A</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>RIGAILL, Guillem J</au><au>CADOT, Sidney</au><au>KLUIN, Roelof J. C</au><au>ZHENG XUE</au><au>BERNARDS, Rene</au><au>MAJEWSKI, Ian J</au><au>WESSELS, Lodewyk F. A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A regression model for estimating DNA copy number applied to capture sequencing data</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2012-09-15</date><risdate>2012</risdate><volume>28</volume><issue>18</issue><spage>2357</spage><epage>2365</epage><pages>2357-2365</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><eissn>1460-2059</eissn><abstract>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. 
In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus.
We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data.
The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/
l.wessels@nki.nl
Supplementary data are available at Bioinformatics online.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>22796958</pmid><doi>10.1093/bioinformatics/bts448</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2012-09, Vol.28 (18), p.2357-2365 |
issn | 1367-4803 1367-4811 1460-2059 |
language | eng |
recordid | cdi_hal_primary_oai_HAL_hal_02648001v1 |
source | PubMed (Medline); Oxford Journals Open Access Collection; MEDLINE; Alma/SFX Local Collection; EZB Electronic Journals Library |
subjects | Algorithms Biological and medical sciences Breast Neoplasms - genetics Cell Line, Tumor DNA Copy Number Variations Female Fundamental and applied biological sciences. Psychology General aspects Genomics - methods Genotype Humans Life Sciences Linear Models Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) Polymorphism, Single Nucleotide Sequence Analysis, DNA Vegetal Biology |
title | A regression model for estimating DNA copy number applied to capture sequencing data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T14%3A16%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20regression%20model%20for%20estimating%20DNA%20copy%20number%20applied%20to%20capture%20sequencing%20data&rft.jtitle=Bioinformatics&rft.au=RIGAILL,%20Guillem%20J&rft.date=2012-09-15&rft.volume=28&rft.issue=18&rft.spage=2357&rft.epage=2365&rft.pages=2357-2365&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/bts448&rft_dat=%3Cproquest_hal_p%3E1434023437%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1434023437&rft_id=info:pmid/22796958&rfr_iscdi=true |
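
The abstract in this record contrasts per-locus log-ratios with regressing test coverage on control coverage. The following is a minimal Python sketch of that contrast, not the authors' C and R implementation. It assumes a least-squares fit through the origin, a diploid control, and coverage that has already been normalized for sequencing depth; the function names (`log2_ratio`, `regression_slope`, `copy_number`) and the toy coverage values are hypothetical.

```python
import math


def log2_ratio(test_cov, control_cov):
    """Log2 ratio at one locus; assumes control_cov > 0.

    A fully deleted locus (test_cov == 0) has no finite log-ratio,
    so it is reported as negative infinity.
    """
    if test_cov == 0:
        return float("-inf")
    return math.log2(test_cov / control_cov)


def regression_slope(test, control):
    """Least-squares slope of test ~ slope * control (no intercept).

    Assumption for illustration: the fit passes through the origin
    and at least one control value is non-zero.
    """
    numerator = sum(t * c for t, c in zip(test, control))
    denominator = sum(c * c for c in control)
    return numerator / denominator


def copy_number(test, control, control_ploidy=2):
    """Copy number estimate for a region, assuming a diploid control."""
    return control_ploidy * regression_slope(test, control)


# Toy coverage values for four loci in one region (hypothetical).
control = [100, 200, 150, 50]
one_copy_loss = [50, 100, 75, 25]   # half the control coverage
full_deletion = [0, 0, 0, 0]        # no reads in the test sample

print(copy_number(one_copy_loss, control))
print(copy_number(full_deletion, control))
print([log2_ratio(t, c) for t, c in zip(full_deletion, control)])
```

In this sketch the deleted region yields a well-defined slope of zero, whereas every per-locus log-ratio is infinite. The slope also weights high-coverage loci more heavily, which is one way of using the total coverage that a ratio discards.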