High-throughput, high-accuracy array-based resequencing

Bibliographic details
Published in: Proceedings of the National Academy of Sciences - PNAS, 2009-04, Vol.106 (16), p.6712-6717
Main authors: Zheng, Jianbiao; Moorhead, Martin; Weng, Li; Siddiqui, Farooq; Carlton, Victoria E.H; Ireland, James S; Lee, Liana; Peterson, Joseph; Wilkins, Jennifer; Lin, Sean; Kan, Zhengyan; Seshagiri, Somasekar; Davis, Ronald W; Faham, Malek
Format: Article
Language: eng
Online access: Full text
container_end_page 6717
container_issue 16
container_start_page 6712
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 106
creator Zheng, Jianbiao
Moorhead, Martin
Weng, Li
Siddiqui, Farooq
Carlton, Victoria E.H
Ireland, James S
Lee, Liana
Peterson, Joseph
Wilkins, Jennifer
Lin, Sean
Kan, Zhengyan
Seshagiri, Somasekar
Davis, Ronald W
Faham, Malek
description Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ≈5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. In the context of this large-scale study we obtained a false positive rate of ≈1 in 500,000 bp and a false negative rate of ≈10%.
doi_str_mv 10.1073/pnas.0901902106
format Article
fullrecord <record><control><sourceid>jstor_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1073_pnas_0901902106</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>40482162</jstor_id><sourcerecordid>40482162</sourcerecordid><originalsourceid>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</originalsourceid><addsrcrecordid>eNp9kc1v1DAQxS0EokvhzAlYcag4kHbGdpz4goQqoEiVOEDP1sRxPlbZeLET1P3v8WqjLnDgZGnmN8_v6TH2EuESoRBXu5HiJWhADRxBPWIrBI2ZkhoesxUAL7JScnnGnsW4AQCdl_CUnaEWaVrqFStu-rbLpi74ue128_R-3R0GZO0cyO7XFALts4qiq9fBRfdzdqPtx_Y5e9LQEN2L5T1nd58__bi-yW6_ffl6_fE2sznHKRNgXcFzDg6ELCvKlVA11oWkIleKK5lrlFznTleNs0KKUouKHJFESEe1OGcfjrq7udq62rpxCjSYXei3FPbGU2_-3ox9Z1r_y3CV_hUqCVwsAsEn83Ey2z5aNww0Oj9HkzDUGosEvv0H3Pg5jCmc4YASNSqeoKsjZIOPMbjmwQmCOTRiDo2YUyPp4vWfAU78UkEC3i3A4fIkpwyq5A65aeZhmNz9lNA3_0cT8epIbOLkwwMiQZb86H9RaMgbakMfzd33FE9A2vJcSfEb4zqxLA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>201419162</pqid></control><display><type>article</type><title>High-throughput, high-accuracy array-based resequencing</title><source>Jstor Complete Legacy</source><source>MEDLINE</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</creator><creatorcontrib>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</creatorcontrib><description>Although genomewide association studies have successfully identified associations of many common 
single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ≈5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in &gt;473 samples; in total &gt;2,350 Mb were sequenced. 
In the context of this large-scale study we obtained a false positive rate of ≈1 in 500,000 bp and a false negative rate of ≈10%.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.0901902106</identifier><identifier>PMID: 19342489</identifier><language>eng</language><publisher>United States: National Academy of Sciences</publisher><subject>Alleles ; Automation ; Base Pair Mismatch ; Biological Sciences ; Complex Systems: From Chemistry to Systems Biology Special Feature ; Correlation analysis ; Deoxyribonucleic acid ; Diabetes ; Disease ; DNA ; DNA probes ; False positive errors ; Gene expression ; Genetic diseases ; Genome, Human - genetics ; Genomics ; Genotype &amp; phenotype ; Humans ; Mutation - genetics ; Oligonucleotide Array Sequence Analysis ; Pipelines ; Polymorphism ; Ratio analysis ; ROC Curve ; Sequence Analysis, DNA - methods ; Sequence Analysis, DNA - standards ; Sequencing</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 2009-04, Vol.106 (16), p.6712-6717</ispartof><rights>Copyright National Academy of Sciences Apr 21, 
2009</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</citedby><cites>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.pnas.org/content/106/16.cover.gif</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/40482162$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/40482162$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,723,776,780,799,881,27901,27902,53766,53768,57992,58225</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19342489$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zheng, Jianbiao</creatorcontrib><creatorcontrib>Moorhead, Martin</creatorcontrib><creatorcontrib>Weng, Li</creatorcontrib><creatorcontrib>Siddiqui, Farooq</creatorcontrib><creatorcontrib>Carlton, Victoria E.H</creatorcontrib><creatorcontrib>Ireland, James S</creatorcontrib><creatorcontrib>Lee, Liana</creatorcontrib><creatorcontrib>Peterson, Joseph</creatorcontrib><creatorcontrib>Wilkins, Jennifer</creatorcontrib><creatorcontrib>Lin, Sean</creatorcontrib><creatorcontrib>Kan, Zhengyan</creatorcontrib><creatorcontrib>Seshagiri, Somasekar</creatorcontrib><creatorcontrib>Davis, Ronald W</creatorcontrib><creatorcontrib>Faham, Malek</creatorcontrib><title>High-throughput, high-accuracy array-based resequencing</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small 
proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ≈5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in &gt;473 samples; in total &gt;2,350 Mb were sequenced. 
In the context of this large-scale study we obtained a false positive rate of ≈1 in 500,000 bp and a false negative rate of ≈10%.</description><subject>Alleles</subject><subject>Automation</subject><subject>Base Pair Mismatch</subject><subject>Biological Sciences</subject><subject>Complex Systems: From Chemistry to Systems Biology Special Feature</subject><subject>Correlation analysis</subject><subject>Deoxyribonucleic acid</subject><subject>Diabetes</subject><subject>Disease</subject><subject>DNA</subject><subject>DNA probes</subject><subject>False positive errors</subject><subject>Gene expression</subject><subject>Genetic diseases</subject><subject>Genome, Human - genetics</subject><subject>Genomics</subject><subject>Genotype &amp; phenotype</subject><subject>Humans</subject><subject>Mutation - genetics</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Pipelines</subject><subject>Polymorphism</subject><subject>Ratio analysis</subject><subject>ROC Curve</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Sequence Analysis, DNA - standards</subject><subject>Sequencing</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kc1v1DAQxS0EokvhzAlYcag4kHbGdpz4goQqoEiVOEDP1sRxPlbZeLET1P3v8WqjLnDgZGnmN8_v6TH2EuESoRBXu5HiJWhADRxBPWIrBI2ZkhoesxUAL7JScnnGnsW4AQCdl_CUnaEWaVrqFStu-rbLpi74ue128_R-3R0GZO0cyO7XFALts4qiq9fBRfdzdqPtx_Y5e9LQEN2L5T1nd58__bi-yW6_ffl6_fE2sznHKRNgXcFzDg6ELCvKlVA11oWkIleKK5lrlFznTleNs0KKUouKHJFESEe1OGcfjrq7udq62rpxCjSYXei3FPbGU2_-3ox9Z1r_y3CV_hUqCVwsAsEn83Ey2z5aNww0Oj9HkzDUGosEvv0H3Pg5jCmc4YASNSqeoKsjZIOPMbjmwQmCOTRiDo2YUyPp4vWfAU78UkEC3i3A4fIkpwyq5A65aeZhmNz9lNA3_0cT8epIbOLkwwMiQZb86H9RaMgbakMfzd33FE9A2vJcSfEb4zqxLA</recordid><startdate>20090421</startdate><enddate>20090421</enddate><creator>Zheng, 
Jianbiao</creator><creator>Moorhead, Martin</creator><creator>Weng, Li</creator><creator>Siddiqui, Farooq</creator><creator>Carlton, Victoria E.H</creator><creator>Ireland, James S</creator><creator>Lee, Liana</creator><creator>Peterson, Joseph</creator><creator>Wilkins, Jennifer</creator><creator>Lin, Sean</creator><creator>Kan, Zhengyan</creator><creator>Seshagiri, Somasekar</creator><creator>Davis, Ronald W</creator><creator>Faham, Malek</creator><general>National Academy of Sciences</general><general>National Acad Sciences</general><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QG</scope><scope>7QL</scope><scope>7QP</scope><scope>7QR</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TK</scope><scope>7TM</scope><scope>7TO</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20090421</creationdate><title>High-throughput, high-accuracy array-based resequencing</title><author>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Alleles</topic><topic>Automation</topic><topic>Base Pair Mismatch</topic><topic>Biological Sciences</topic><topic>Complex Systems: From Chemistry to Systems Biology Special Feature</topic><topic>Correlation analysis</topic><topic>Deoxyribonucleic 
acid</topic><topic>Diabetes</topic><topic>Disease</topic><topic>DNA</topic><topic>DNA probes</topic><topic>False positive errors</topic><topic>Gene expression</topic><topic>Genetic diseases</topic><topic>Genome, Human - genetics</topic><topic>Genomics</topic><topic>Genotype &amp; phenotype</topic><topic>Humans</topic><topic>Mutation - genetics</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Pipelines</topic><topic>Polymorphism</topic><topic>Ratio analysis</topic><topic>ROC Curve</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Sequence Analysis, DNA - standards</topic><topic>Sequencing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zheng, Jianbiao</creatorcontrib><creatorcontrib>Moorhead, Martin</creatorcontrib><creatorcontrib>Weng, Li</creatorcontrib><creatorcontrib>Siddiqui, Farooq</creatorcontrib><creatorcontrib>Carlton, Victoria E.H</creatorcontrib><creatorcontrib>Ireland, James S</creatorcontrib><creatorcontrib>Lee, Liana</creatorcontrib><creatorcontrib>Peterson, Joseph</creatorcontrib><creatorcontrib>Wilkins, Jennifer</creatorcontrib><creatorcontrib>Lin, Sean</creatorcontrib><creatorcontrib>Kan, Zhengyan</creatorcontrib><creatorcontrib>Seshagiri, Somasekar</creatorcontrib><creatorcontrib>Davis, Ronald W</creatorcontrib><creatorcontrib>Faham, Malek</creatorcontrib><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology 
Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zheng, Jianbiao</au><au>Moorhead, Martin</au><au>Weng, Li</au><au>Siddiqui, Farooq</au><au>Carlton, Victoria E.H</au><au>Ireland, James S</au><au>Lee, Liana</au><au>Peterson, Joseph</au><au>Wilkins, Jennifer</au><au>Lin, Sean</au><au>Kan, Zhengyan</au><au>Seshagiri, Somasekar</au><au>Davis, Ronald W</au><au>Faham, Malek</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High-throughput, high-accuracy array-based resequencing</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S A</addtitle><date>2009-04-21</date><risdate>2009</risdate><volume>106</volume><issue>16</issue><spage>6712</spage><epage>6717</epage><pages>6712-6717</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><abstract>Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability 
of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ≈5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in &gt;473 samples; in total &gt;2,350 Mb were sequenced. In the context of this large-scale study we obtained a false positive rate of ≈1 in 500,000 bp and a false negative rate of ≈10%.</abstract><cop>United States</cop><pub>National Academy of Sciences</pub><pmid>19342489</pmid><doi>10.1073/pnas.0901902106</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 2009-04, Vol.106 (16), p.6712-6717
issn 0027-8424
1091-6490
language eng
recordid cdi_crossref_primary_10_1073_pnas_0901902106
source Jstor Complete Legacy; MEDLINE; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Alleles
Automation
Base Pair Mismatch
Biological Sciences
Complex Systems: From Chemistry to Systems Biology Special Feature
Correlation analysis
Deoxyribonucleic acid
Diabetes
Disease
DNA
DNA probes
False positive errors
Gene expression
Genetic diseases
Genome, Human - genetics
Genomics
Genotype & phenotype
Humans
Mutation - genetics
Oligonucleotide Array Sequence Analysis
Pipelines
Polymorphism
Ratio analysis
ROC Curve
Sequence Analysis, DNA - methods
Sequence Analysis, DNA - standards
Sequencing
title High-throughput, high-accuracy array-based resequencing
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T14%3A53%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High-throughput,%20high-accuracy%20array-based%20resequencing&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Zheng,%20Jianbiao&rft.date=2009-04-21&rft.volume=106&rft.issue=16&rft.spage=6712&rft.epage=6717&rft.pages=6712-6717&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.0901902106&rft_dat=%3Cjstor_cross%3E40482162%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=201419162&rft_id=info:pmid/19342489&rft_jstor_id=40482162&rfr_iscdi=true