A regression model for estimating DNA copy number applied to capture sequencing data

Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the...

Bibliographic details
Published in:	Bioinformatics 2012-09, Vol.28 (18), p.2357-2365
Main authors:	RIGAILL, Guillem J; CADOT, Sidney; KLUIN, Roelof J. C; ZHENG XUE; BERNARDS, Rene; MAJEWSKI, Ian J; WESSELS, Lodewyk F. A
Format:	Article
Language:	eng
Subjects:	Algorithms; Biological and medical sciences; Breast Neoplasms - genetics; Cell Line, Tumor; DNA Copy Number Variations; Female; Fundamental and applied biological sciences. Psychology; General aspects; Genomics - methods; Genotype; Humans; Life Sciences; Linear Models; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Vegetal Biology
Online access:	Full text

container_end_page	2365
container_issue	18
container_start_page	2357
container_title	Bioinformatics
container_volume	28
creator	RIGAILL, Guillem J; CADOT, Sidney; KLUIN, Roelof J. C; ZHENG XUE; BERNARDS, Rene; MAJEWSKI, Ian J; WESSELS, Lodewyk F. A
description	Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is common practice to determine log-ratios between test and control samples, but this approach loses information because it disregards the total coverage or intensity at a locus. We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach can handle regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism (SNP) genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. Availability: the algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/. Contact: l.wessels@nki.nl. Supplementary data are available at Bioinformatics online.
doi_str_mv	10.1093/bioinformatics/bts448
format	Article
fullrecord	<record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_02648001v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1434023437</sourcerecordid><originalsourceid>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</originalsourceid><addsrcrecordid>eNpVkMtKAzEUhoMo3h9ByUbQRW3uM10W71B0033IJCd1ZGYyJjOCb29Ka8VVwuE7f_58CF1QckvJjE-rOtSdD7E1Q23TtBqSEOUeOqZcFRNRUrq_uxN-hE5S-iCESCLVITpirJipmSyP0XKOI6wipFSHDrfBQYNzKoY01OvoboXvX-fYhv4bd2NbQcSm75saHB4CtqYfxgg4wecInV3TzgzmDB140yQ4356naPn4sLx7nizenl7u5ouJFZIPE-VKI7hQ0isPlDE2A1JK55hTjEhbgHOScgu-YgQ8N0bSPFJVJSjzXvBTdLOJfTeN7mPuG791MLV-ni_0ekaYyr8n9Itm9nrD9jHkrmnQbZ0sNI3pIIxJ01yEMC54kVG5QW0MKUXwu2xK9Nq9_u9eb9znvcvtE2PVgttt_crOwNUWMMmaxkeTlaU_TnGuZFHyH0ACkqA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1434023437</pqid></control><display><type>article</type><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><source>PubMed (Medline)</source><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>Alma/SFX Local Collection</source><source>EZB Electronic Journals Library</source><creator>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A</creator><creatorcontrib>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. A</creatorcontrib><description>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. 
In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus. We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/ l.wessels@nki.nl Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/bts448</identifier><identifier>PMID: 22796958</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Biological and medical sciences ; Breast Neoplasms - genetics ; Cell Line, Tumor ; DNA Copy Number Variations ; Female ; Fundamental and applied biological sciences. Psychology ; General aspects ; Genomics - methods ; Genotype ; Humans ; Life Sciences ; Linear Models ; Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects) ; Polymorphism, Single Nucleotide ; Sequence Analysis, DNA ; Vegetal Biology</subject><ispartof>Bioinformatics, 2012-09, Vol.28 (18), p.2357-2365</ispartof><rights>2015 INIST-CNRS</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</citedby><cites>FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=26336578$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22796958$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.inrae.fr/hal-02648001$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>RIGAILL, Guillem J</creatorcontrib><creatorcontrib>CADOT, Sidney</creatorcontrib><creatorcontrib>KLUIN, Roelof J. C</creatorcontrib><creatorcontrib>ZHENG XUE</creatorcontrib><creatorcontrib>BERNARDS, Rene</creatorcontrib><creatorcontrib>MAJEWSKI, Ian J</creatorcontrib><creatorcontrib>WESSELS, Lodewyk F. A</creatorcontrib><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. 
It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus. We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/ l.wessels@nki.nl Supplementary data are available at Bioinformatics online.</description><subject>Algorithms</subject><subject>Biological and medical sciences</subject><subject>Breast Neoplasms - genetics</subject><subject>Cell Line, Tumor</subject><subject>DNA Copy Number Variations</subject><subject>Female</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Genomics - methods</subject><subject>Genotype</subject><subject>Humans</subject><subject>Life Sciences</subject><subject>Linear Models</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Sequence Analysis, DNA</subject><subject>Vegetal Biology</subject><issn>1367-4803</issn><issn>1367-4811</issn><issn>1460-2059</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkMtKAzEUhoMo3h9ByUbQRW3uM10W71B0033IJCd1ZGYyJjOCb29Ka8VVwuE7f_58CF1QckvJjE-rOtSdD7E1Q23TtBqSEOUeOqZcFRNRUrq_uxN-hE5S-iCESCLVITpirJipmSyP0XKOI6wipFSHDrfBQYNzKoY01OvoboXvX-fYhv4bd2NbQcSm75saHB4CtqYfxgg4wecInV3TzgzmDB140yQ4356naPn4sLx7nizenl7u5ouJFZIPE-VKI7hQ0isPlDE2A1JK55hTjEhbgHOScgu-YgQ8N0bSPFJVJSjzXvBTdLOJfTeN7mPuG791MLV-ni_0ekaYyr8n9Itm9nrD9jHkrmnQbZ0sNI3pIIxJ01yEMC54kVG5QW0MKUXwu2xK9Nq9_u9eb9znvcvtE2PVgttt_crOwNUWMMmaxkeTlaU_TnGuZFHyH0ACkqA</recordid><startdate>20120915</startdate><enddate>20120915</enddate><creator>RIGAILL, Guillem J</creator><creator>CADOT, Sidney</creator><creator>KLUIN, Roelof J. C</creator><creator>ZHENG XUE</creator><creator>BERNARDS, Rene</creator><creator>MAJEWSKI, Ian J</creator><creator>WESSELS, Lodewyk F. A</creator><general>Oxford University Press</general><general>Oxford University Press (OUP)</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7TM</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>1XC</scope></search><sort><creationdate>20120915</creationdate><title>A regression model for estimating DNA copy number applied to capture sequencing data</title><author>RIGAILL, Guillem J ; CADOT, Sidney ; KLUIN, Roelof J. C ; ZHENG XUE ; BERNARDS, Rene ; MAJEWSKI, Ian J ; WESSELS, Lodewyk F. 
A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c453t-6d8a43465f6fe12229e085dd2d6205c7edd513cefb20ef3aa51edd6bb412ff43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Algorithms</topic><topic>Biological and medical sciences</topic><topic>Breast Neoplasms - genetics</topic><topic>Cell Line, Tumor</topic><topic>DNA Copy Number Variations</topic><topic>Female</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Genomics - methods</topic><topic>Genotype</topic><topic>Humans</topic><topic>Life Sciences</topic><topic>Linear Models</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Sequence Analysis, DNA</topic><topic>Vegetal Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>RIGAILL, Guillem J</creatorcontrib><creatorcontrib>CADOT, Sidney</creatorcontrib><creatorcontrib>KLUIN, Roelof J. C</creatorcontrib><creatorcontrib>ZHENG XUE</creatorcontrib><creatorcontrib>BERNARDS, Rene</creatorcontrib><creatorcontrib>MAJEWSKI, Ian J</creatorcontrib><creatorcontrib>WESSELS, Lodewyk F. 
A</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>RIGAILL, Guillem J</au><au>CADOT, Sidney</au><au>KLUIN, Roelof J. C</au><au>ZHENG XUE</au><au>BERNARDS, Rene</au><au>MAJEWSKI, Ian J</au><au>WESSELS, Lodewyk F. A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A regression model for estimating DNA copy number applied to capture sequencing data</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2012-09-15</date><risdate>2012</risdate><volume>28</volume><issue>18</issue><spage>2357</spage><epage>2365</epage><pages>2357-2365</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><eissn>1460-2059</eissn><abstract>Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. It can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus. 
We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/ l.wessels@nki.nl Supplementary data are available at Bioinformatics online.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>22796958</pmid><doi>10.1093/bioinformatics/bts448</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1367-4803
ispartof	Bioinformatics, 2012-09, Vol.28 (18), p.2357-2365
issn	1367-4803 1367-4811 1460-2059
language	eng
recordid	cdi_hal_primary_oai_HAL_hal_02648001v1
source	PubMed (Medline); Oxford Journals Open Access Collection; MEDLINE; Alma/SFX Local Collection; EZB Electronic Journals Library
subjects	Algorithms; Biological and medical sciences; Breast Neoplasms - genetics; Cell Line, Tumor; DNA Copy Number Variations; Female; Fundamental and applied biological sciences. Psychology; General aspects; Genomics - methods; Genotype; Humans; Life Sciences; Linear Models; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Vegetal Biology
title	A regression model for estimating DNA copy number applied to capture sequencing data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T14%3A16%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20regression%20model%20for%20estimating%20DNA%20copy%20number%20applied%20to%20capture%20sequencing%20data&rft.jtitle=Bioinformatics&rft.au=RIGAILL,%20Guillem%20J&rft.date=2012-09-15&rft.volume=28&rft.issue=18&rft.spage=2357&rft.epage=2365&rft.pages=2357-2365&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/bts448&rft_dat=%3Cproquest_hal_p%3E1434023437%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1434023437&rft_id=info:pmid/22796958&rfr_iscdi=true
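
Illustrative note: the description above contrasts log-ratios with modeling test coverage as a linear function of control coverage. The following is a minimal sketch of that contrast, not the authors' C and R implementation. It assumes a simplified model with no intercept (test ~ b * control), and the function names and coverage values are hypothetical, chosen only for illustration.

```python
import math


def slope_through_origin(control, test):
    """Least-squares slope b for the model test ~ b * control (no intercept)."""
    sxx = sum(c * c for c in control)
    if sxx == 0:
        raise ValueError("control coverage is zero at every position")
    sxy = sum(c * t for c, t in zip(control, test))
    return sxy / sxx


def mean_log2_ratio(control, test):
    """Mean per-position log2(test / control); fails if any test value is 0."""
    ratios = [math.log2(t / c) for c, t in zip(control, test)]
    return sum(ratios) / len(ratios)


# Hypothetical per-position coverage for one captured region.
control = [100.0, 200.0, 150.0, 50.0]
gained = [150.0, 300.0, 225.0, 75.0]   # 1.5x the control at every position
deleted = [0.0, 0.0, 0.0, 0.0]         # region completely deleted in the test

gain_slope = slope_through_origin(control, gained)
deleted_slope = slope_through_origin(control, deleted)

try:
    mean_log2_ratio(control, deleted)
    log_ratio_defined = True
except ValueError:
    # math.log2(0.0) raises ValueError, so no log-ratio exists for this region.
    log_ratio_defined = False

print(gain_slope, deleted_slope, log_ratio_defined)
```

In this sketch the fitted slope for the deleted region is simply 0, while the log-ratio cannot be computed at all. Weighting positions by their control coverage is also how a regression of this kind retains the total-coverage information that a plain ratio discards.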