High-throughput, high-accuracy array-based resequencing
Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that commo...
Published in: | Proceedings of the National Academy of Sciences - PNAS 2009-04, Vol.106 (16), p.6712-6717 |
---|---|
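Main authors: | Zheng, Jianbiao; Moorhead, Martin; Weng, Li; Siddiqui, Farooq; Carlton, Victoria E.H; Ireland, James S; Lee, Liana; Peterson, Joseph; Wilkins, Jennifer; Lin, Sean; Kan, Zhengyan; Seshagiri, Somasekar; Davis, Ronald W; Faham, Malek |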
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 6717 |
---|---|
container_issue | 16 |
container_start_page | 6712 |
container_title | Proceedings of the National Academy of Sciences - PNAS |
container_volume | 106 |
creator | Zheng, Jianbiao; Moorhead, Martin; Weng, Li; Siddiqui, Farooq; Carlton, Victoria E.H; Ireland, James S; Lee, Liana; Peterson, Joseph; Wilkins, Jennifer; Lin, Sean; Kan, Zhengyan; Seshagiri, Somasekar; Davis, Ronald W; Faham, Malek |
description | Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ≈5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. In the context of this large-scale study we obtained a false positive rate of ≈1 in 500,000 bp and a false negative rate of ≈10%. |
doi_str_mv | 10.1073/pnas.0901902106 |
format | Article |
fullrecord | <record><control><sourceid>jstor_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1073_pnas_0901902106</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>40482162</jstor_id><sourcerecordid>40482162</sourcerecordid><originalsourceid>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</originalsourceid><addsrcrecordid>eNp9kc1v1DAQxS0EokvhzAlYcag4kHbGdpz4goQqoEiVOEDP1sRxPlbZeLET1P3v8WqjLnDgZGnmN8_v6TH2EuESoRBXu5HiJWhADRxBPWIrBI2ZkhoesxUAL7JScnnGnsW4AQCdl_CUnaEWaVrqFStu-rbLpi74ue128_R-3R0GZO0cyO7XFALts4qiq9fBRfdzdqPtx_Y5e9LQEN2L5T1nd58__bi-yW6_ffl6_fE2sznHKRNgXcFzDg6ELCvKlVA11oWkIleKK5lrlFznTleNs0KKUouKHJFESEe1OGcfjrq7udq62rpxCjSYXei3FPbGU2_-3ox9Z1r_y3CV_hUqCVwsAsEn83Ey2z5aNww0Oj9HkzDUGosEvv0H3Pg5jCmc4YASNSqeoKsjZIOPMbjmwQmCOTRiDo2YUyPp4vWfAU78UkEC3i3A4fIkpwyq5A65aeZhmNz9lNA3_0cT8epIbOLkwwMiQZb86H9RaMgbakMfzd33FE9A2vJcSfEb4zqxLA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>201419162</pqid></control><display><type>article</type><title>High-throughput, high-accuracy array-based resequencing</title><source>Jstor Complete Legacy</source><source>MEDLINE</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</creator><creatorcontrib>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</creatorcontrib><description>Although genomewide association studies have successfully identified associations of many 
common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence [almost equal to]5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. 
In the context of this large-scale study we obtained a false positive rate of [almost equal to]1 in 500,000 bp and a false negative rate of [almost equal to]10%.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.0901902106</identifier><identifier>PMID: 19342489</identifier><language>eng</language><publisher>United States: National Academy of Sciences</publisher><subject>Alleles ; Automation ; Base Pair Mismatch ; Biological Sciences ; Complex Systems: From Chemistry to Systems Biology Special Feature ; Correlation analysis ; Deoxyribonucleic acid ; Diabetes ; Disease ; DNA ; DNA probes ; False positive errors ; Gene expression ; Genetic diseases ; Genome, Human - genetics ; Genomics ; Genotype & phenotype ; Humans ; Mutation - genetics ; Oligonucleotide Array Sequence Analysis ; Pipelines ; Polymorphism ; Ratio analysis ; ROC Curve ; Sequence Analysis, DNA - methods ; Sequence Analysis, DNA - standards ; Sequencing</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 2009-04, Vol.106 (16), p.6712-6717</ispartof><rights>Copyright National Academy of Sciences Apr 21, 
2009</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</citedby><cites>FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.pnas.org/content/106/16.cover.gif</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/40482162$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/40482162$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,723,776,780,799,881,27901,27902,53766,53768,57992,58225</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19342489$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zheng, Jianbiao</creatorcontrib><creatorcontrib>Moorhead, Martin</creatorcontrib><creatorcontrib>Weng, Li</creatorcontrib><creatorcontrib>Siddiqui, Farooq</creatorcontrib><creatorcontrib>Carlton, Victoria E.H</creatorcontrib><creatorcontrib>Ireland, James S</creatorcontrib><creatorcontrib>Lee, Liana</creatorcontrib><creatorcontrib>Peterson, Joseph</creatorcontrib><creatorcontrib>Wilkins, Jennifer</creatorcontrib><creatorcontrib>Lin, Sean</creatorcontrib><creatorcontrib>Kan, Zhengyan</creatorcontrib><creatorcontrib>Seshagiri, Somasekar</creatorcontrib><creatorcontrib>Davis, Ronald W</creatorcontrib><creatorcontrib>Faham, Malek</creatorcontrib><title>High-throughput, high-accuracy array-based resequencing</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small 
proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence [almost equal to]5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. 
In the context of this large-scale study we obtained a false positive rate of [almost equal to]1 in 500,000 bp and a false negative rate of [almost equal to]10%.</description><subject>Alleles</subject><subject>Automation</subject><subject>Base Pair Mismatch</subject><subject>Biological Sciences</subject><subject>Complex Systems: From Chemistry to Systems Biology Special Feature</subject><subject>Correlation analysis</subject><subject>Deoxyribonucleic acid</subject><subject>Diabetes</subject><subject>Disease</subject><subject>DNA</subject><subject>DNA probes</subject><subject>False positive errors</subject><subject>Gene expression</subject><subject>Genetic diseases</subject><subject>Genome, Human - genetics</subject><subject>Genomics</subject><subject>Genotype & phenotype</subject><subject>Humans</subject><subject>Mutation - genetics</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Pipelines</subject><subject>Polymorphism</subject><subject>Ratio analysis</subject><subject>ROC Curve</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Sequence Analysis, DNA - standards</subject><subject>Sequencing</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kc1v1DAQxS0EokvhzAlYcag4kHbGdpz4goQqoEiVOEDP1sRxPlbZeLET1P3v8WqjLnDgZGnmN8_v6TH2EuESoRBXu5HiJWhADRxBPWIrBI2ZkhoesxUAL7JScnnGnsW4AQCdl_CUnaEWaVrqFStu-rbLpi74ue128_R-3R0GZO0cyO7XFALts4qiq9fBRfdzdqPtx_Y5e9LQEN2L5T1nd58__bi-yW6_ffl6_fE2sznHKRNgXcFzDg6ELCvKlVA11oWkIleKK5lrlFznTleNs0KKUouKHJFESEe1OGcfjrq7udq62rpxCjSYXei3FPbGU2_-3ox9Z1r_y3CV_hUqCVwsAsEn83Ey2z5aNww0Oj9HkzDUGosEvv0H3Pg5jCmc4YASNSqeoKsjZIOPMbjmwQmCOTRiDo2YUyPp4vWfAU78UkEC3i3A4fIkpwyq5A65aeZhmNz9lNA3_0cT8epIbOLkwwMiQZb86H9RaMgbakMfzd33FE9A2vJcSfEb4zqxLA</recordid><startdate>20090421</startdate><enddate>20090421</enddate><creator>Zheng, 
Jianbiao</creator><creator>Moorhead, Martin</creator><creator>Weng, Li</creator><creator>Siddiqui, Farooq</creator><creator>Carlton, Victoria E.H</creator><creator>Ireland, James S</creator><creator>Lee, Liana</creator><creator>Peterson, Joseph</creator><creator>Wilkins, Jennifer</creator><creator>Lin, Sean</creator><creator>Kan, Zhengyan</creator><creator>Seshagiri, Somasekar</creator><creator>Davis, Ronald W</creator><creator>Faham, Malek</creator><general>National Academy of Sciences</general><general>National Acad Sciences</general><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QG</scope><scope>7QL</scope><scope>7QP</scope><scope>7QR</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TK</scope><scope>7TM</scope><scope>7TO</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20090421</creationdate><title>High-throughput, high-accuracy array-based resequencing</title><author>Zheng, Jianbiao ; Moorhead, Martin ; Weng, Li ; Siddiqui, Farooq ; Carlton, Victoria E.H ; Ireland, James S ; Lee, Liana ; Peterson, Joseph ; Wilkins, Jennifer ; Lin, Sean ; Kan, Zhengyan ; Seshagiri, Somasekar ; Davis, Ronald W ; Faham, Malek</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c521t-30ce72520e0348ba5636d1d74a75662645914295e9bfec343893baeaa41020ed3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Alleles</topic><topic>Automation</topic><topic>Base Pair Mismatch</topic><topic>Biological Sciences</topic><topic>Complex Systems: From Chemistry to Systems Biology Special Feature</topic><topic>Correlation analysis</topic><topic>Deoxyribonucleic 
acid</topic><topic>Diabetes</topic><topic>Disease</topic><topic>DNA</topic><topic>DNA probes</topic><topic>False positive errors</topic><topic>Gene expression</topic><topic>Genetic diseases</topic><topic>Genome, Human - genetics</topic><topic>Genomics</topic><topic>Genotype & phenotype</topic><topic>Humans</topic><topic>Mutation - genetics</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Pipelines</topic><topic>Polymorphism</topic><topic>Ratio analysis</topic><topic>ROC Curve</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Sequence Analysis, DNA - standards</topic><topic>Sequencing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zheng, Jianbiao</creatorcontrib><creatorcontrib>Moorhead, Martin</creatorcontrib><creatorcontrib>Weng, Li</creatorcontrib><creatorcontrib>Siddiqui, Farooq</creatorcontrib><creatorcontrib>Carlton, Victoria E.H</creatorcontrib><creatorcontrib>Ireland, James S</creatorcontrib><creatorcontrib>Lee, Liana</creatorcontrib><creatorcontrib>Peterson, Joseph</creatorcontrib><creatorcontrib>Wilkins, Jennifer</creatorcontrib><creatorcontrib>Lin, Sean</creatorcontrib><creatorcontrib>Kan, Zhengyan</creatorcontrib><creatorcontrib>Seshagiri, Somasekar</creatorcontrib><creatorcontrib>Davis, Ronald W</creatorcontrib><creatorcontrib>Faham, Malek</creatorcontrib><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology 
Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zheng, Jianbiao</au><au>Moorhead, Martin</au><au>Weng, Li</au><au>Siddiqui, Farooq</au><au>Carlton, Victoria E.H</au><au>Ireland, James S</au><au>Lee, Liana</au><au>Peterson, Joseph</au><au>Wilkins, Jennifer</au><au>Lin, Sean</au><au>Kan, Zhengyan</au><au>Seshagiri, Somasekar</au><au>Davis, Ronald W</au><au>Faham, Malek</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High-throughput, high-accuracy array-based resequencing</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S A</addtitle><date>2009-04-21</date><risdate>2009</risdate><volume>106</volume><issue>16</issue><spage>6712</spage><epage>6717</epage><pages>6712-6717</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><abstract>Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability 
of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence [almost equal to]5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. In the context of this large-scale study we obtained a false positive rate of [almost equal to]1 in 500,000 bp and a false negative rate of [almost equal to]10%.</abstract><cop>United States</cop><pub>National Academy of Sciences</pub><pmid>19342489</pmid><doi>10.1073/pnas.0901902106</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0027-8424 |
ispartof | Proceedings of the National Academy of Sciences - PNAS, 2009-04, Vol.106 (16), p.6712-6717 |
issn | 0027-8424; 1091-6490 |
language | eng |
recordid | cdi_crossref_primary_10_1073_pnas_0901902106 |
source | Jstor Complete Legacy; MEDLINE; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry |
subjects | Alleles; Automation; Base Pair Mismatch; Biological Sciences; Complex Systems: From Chemistry to Systems Biology Special Feature; Correlation analysis; Deoxyribonucleic acid; Diabetes; Disease; DNA; DNA probes; False positive errors; Gene expression; Genetic diseases; Genome, Human - genetics; Genomics; Genotype & phenotype; Humans; Mutation - genetics; Oligonucleotide Array Sequence Analysis; Pipelines; Polymorphism; Ratio analysis; ROC Curve; Sequence Analysis, DNA - methods; Sequence Analysis, DNA - standards; Sequencing |
title | High-throughput, high-accuracy array-based resequencing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T14%3A53%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High-throughput,%20high-accuracy%20array-based%20resequencing&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Zheng,%20Jianbiao&rft.date=2009-04-21&rft.volume=106&rft.issue=16&rft.spage=6712&rft.epage=6717&rft.pages=6712-6717&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.0901902106&rft_dat=%3Cjstor_cross%3E40482162%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=201419162&rft_id=info:pmid/19342489&rft_jstor_id=40482162&rfr_iscdi=true |