Automating sequence-based detection and genotyping of SNPs from diploid samples
The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the...
Published in: | Nature genetics 2006-03, Vol.38 (3), p.375-381 |
---|---|
Main authors: | Stephens, Matthew; Sloan, James S; Robertson, P D; Scheet, Paul; Nickerson, Deborah A |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 381 |
---|---|
container_issue | 3 |
container_start_page | 375 |
container_title | Nature genetics |
container_volume | 38 |
creator | Stephens, Matthew; Sloan, James S; Robertson, P D; Scheet, Paul; Nickerson, Deborah A |
description | The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47–90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use. |
doi_str_mv | 10.1038/ng1746 |
format | Article |
fullrecord | <record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_864950487</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A183393858</galeid><sourcetype>Aggregation Database</sourcetype><recordtype>article</recordtype><pqid>222633223</pqid></control><display><type>article</type><title>Automating sequence-based detection and genotyping of SNPs from diploid samples</title><source>MEDLINE</source><source>Nature Journals Online</source><source>SpringerLink Journals - AutoHoldings</source><creator>Stephens, Matthew ; Sloan, James S ; Robertson, P D ; Scheet, Paul ; Nickerson, Deborah A</creator><identifier>ISSN: 1061-4036</identifier><identifier>EISSN: 1546-1718</identifier><identifier>DOI: 10.1038/ng1746</identifier><identifier>PMID: 16493422</identifier><identifier>CODEN: NGENEC</identifier><language>eng</language><publisher>New York: Nature Publishing Group US</publisher><ispartof>Nature genetics, 2006-03, Vol.38 (3), p.375-381</ispartof><rights>Springer Nature America, Inc. 2006</rights><rights>2006 INIST-CNRS</rights><rights>COPYRIGHT 2006 Nature Publishing Group</rights><rights>Copyright Nature Publishing Group Mar 2006</rights><lds50>peer_reviewed</lds50></display><addata><genre>article</genre><date>2006-03-01</date><volume>38</volume><issue>3</issue><pages>375-381</pages><cop>New York</cop><pub>Nature Publishing Group US</pub><pmid>16493422</pmid><doi>10.1038/ng1746</doi><tpages>7</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1061-4036 |
ispartof | Nature genetics, 2006-03, Vol.38 (3), p.375-381 |
issn | 1061-4036; 1546-1718 |
language | eng |
recordid | cdi_proquest_miscellaneous_864950487 |
source | MEDLINE; Nature Journals Online; SpringerLink Journals - AutoHoldings |
subjects | Agriculture; Algorithms; Animal Genetics and Genomics; Automation - methods; Biological and medical sciences; Biomedical and Life Sciences; Biomedicine; Cancer Research; Chromosomes; Deoxyribonucleic acid; Diploidy; DNA; DNA - genetics; Fluorescence; Fundamental and applied biological sciences. Psychology; Gene amplification; Gene Function; Genetic Variation; Genetics of eukaryotes. Biological and molecular evolution; Genotype; Genotype & phenotype; Genotypes; Human Genetics; Physiological aspects; Polymorphism, Single Nucleotide; Reproducibility of Results; Sensitivity and Specificity; Single nucleotide polymorphisms; technical-report |
title | Automating sequence-based detection and genotyping of SNPs from diploid samples |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T05%3A13%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automating%20sequence-based%20detection%20and%20genotyping%20of%20SNPs%20from%20diploid%20samples&rft.jtitle=Nature%20genetics&rft.au=Stephens,%20Matthew&rft.date=2006-03-01&rft.volume=38&rft.issue=3&rft.spage=375&rft.epage=381&rft.pages=375-381&rft.issn=1061-4036&rft.eissn=1546-1718&rft.coden=NGENEC&rft_id=info:doi/10.1038/ng1746&rft_dat=%3Cgale_proqu%3EA183393858%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=222633223&rft_id=info:pmid/16493422&rft_galeid=A183393858&rfr_iscdi=true |
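
The url field above is an OpenURL query string in which the citation is carried by `rft.*` keys and the identifiers by repeated `rft_id` keys. The following is a minimal sketch of how such a link might be decoded with Python's standard library. The `OPENURL` constant is a shortened copy of the record's url (only a handful of its parameters are kept), and the helper name `citation_fields` is hypothetical, introduced here for illustration only.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the record's url field; only some rft.* keys are kept
# so the example stays readable. The values are taken from the record.
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=Nature%20genetics"
    "&rft.au=Stephens,%20Matthew"
    "&rft.date=2006-03-01"
    "&rft.volume=38&rft.issue=3&rft.spage=375&rft.epage=381"
    "&rft_id=info:doi/10.1038/ng1746"
    "&rft_id=info:pmid/16493422"
)


def citation_fields(url):
    """Hypothetical helper: collect rft.* citation keys and rft_id identifiers."""
    # parse_qs percent-decodes values and returns a list per key,
    # because a key such as rft_id may occur more than once.
    query = parse_qs(urlsplit(url).query)
    fields = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    fields["ids"] = query.get("rft_id", [])
    return fields


fields = citation_fields(OPENURL)
print(fields["jtitle"], fields["volume"], fields["spage"], fields["ids"])
```

Keeping `rft_id` as a list rather than a single value preserves both the DOI and the PMID, which the record supplies under the same key.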