Automating sequence-based detection and genotyping of SNPs from diploid samples

The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the...

Detailed description

Bibliographic details
Published in: Nature genetics 2006-03, Vol.38 (3), p.375-381
Main authors: Stephens, Matthew; Sloan, James S; Robertson, P D; Scheet, Paul; Nickerson, Deborah A
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 381
container_issue 3
container_start_page 375
container_title Nature genetics
container_volume 38
creator Stephens, Matthew
Sloan, James S
Robertson, P D
Scheet, Paul
Nickerson, Deborah A
description The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47–90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.
doi_str_mv 10.1038/ng1746
format Article
fullrecord <record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_864950487</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A183393858</galeid><sourcerecordid>A183393858</sourcerecordid><originalsourceid>FETCH-LOGICAL-c651t-2ca87a4def392f1692400e15395187dc42a8117777a7e4a4eeb76e7b04f745ac3</originalsourceid><addsrcrecordid>eNqN0luL1DAUAOAiintRf4FIUVR86Jpbk_RxWLwsLI646mvIpCelS5vUJAX332-GKTuMito8NCTfOclpT1E8wegMIyrfug4Lxu8Vx7hmvMICy_t5jjiuGKL8qDiJ8RohzBiSD4sjzFlDGSHHxXo1Jz_q1LuujPBjBmeg2ugIbdlCApN670rt2rID59PNtHXellefPsfSBj-WbT8Nvm_LqMdpgPioeGD1EOHx8j4tvr1_9_X8Y3W5_nBxvrqsDK9xqojRUmjWgqUNsZg3hCEEuKZNjaVoDSNaYizyowUwzQA2goPYIGYFq7Whp8XrXd4p-HzrmNTYRwPDoB34OSqZK6wRkyLLV3-VXPCG1Zz8E2KBMiVb-PwXeO3n4HK5ihDCKc0moxc71OkBVO-sT0GbbUa1wpLShspaZnX2B5VHC2NvvAPb5_WDgDcHAdkk-Jk6PceoLq6-_L9dfz-0S_Em-BgDWDWFftThRmGkth2mdh2W4bOl-HkzQrtnS0tl8HIBOho92KCd6ePeCYHJ9ty7Pxjzlusg7L_ib0c-3Umn0xzgLtWyfQtZkOmP</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>222633223</pqid></control><display><type>article</type><title>Automating sequence-based detection and genotyping of SNPs from diploid samples</title><source>MEDLINE</source><source>Nature Journals Online</source><source>SpringerLink Journals - AutoHoldings</source><creator>Stephens, Matthew ; Sloan, James S ; Robertson, P D ; Scheet, Paul ; Nickerson, Deborah A</creator><creatorcontrib>Stephens, Matthew ; Sloan, James S ; Robertson, P D ; Scheet, Paul ; Nickerson, Deborah A</creatorcontrib><description>The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. 
Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47–90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.</description><identifier>ISSN: 1061-4036</identifier><identifier>EISSN: 1546-1718</identifier><identifier>DOI: 10.1038/ng1746</identifier><identifier>PMID: 16493422</identifier><identifier>CODEN: NGENEC</identifier><language>eng</language><publisher>New York: Nature Publishing Group US</publisher><subject>Agriculture ; Algorithms ; Animal Genetics and Genomics ; Automation - methods ; Biological and medical sciences ; Biomedical and Life Sciences ; Biomedicine ; Cancer Research ; Chromosomes ; Deoxyribonucleic acid ; Diploidy ; DNA ; DNA - genetics ; Fluorescence ; Fundamental and applied biological sciences. Psychology ; Gene amplification ; Gene Function ; Genetic Variation ; Genetics of eukaryotes. 
Biological and molecular evolution ; Genotype ; Genotype &amp; phenotype ; Genotypes ; Human Genetics ; Physiological aspects ; Polymorphism, Single Nucleotide ; Reproducibility of Results ; Sensitivity and Specificity ; Single nucleotide polymorphisms ; technical-report</subject><ispartof>Nature genetics, 2006-03, Vol.38 (3), p.375-381</ispartof><rights>Springer Nature America, Inc. 2006</rights><rights>2006 INIST-CNRS</rights><rights>COPYRIGHT 2006 Nature Publishing Group</rights><rights>Copyright Nature Publishing Group Mar 2006</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c651t-2ca87a4def392f1692400e15395187dc42a8117777a7e4a4eeb76e7b04f745ac3</citedby><cites>FETCH-LOGICAL-c651t-2ca87a4def392f1692400e15395187dc42a8117777a7e4a4eeb76e7b04f745ac3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1038/ng1746$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1038/ng1746$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,2727,27924,27925,41488,42557,51319</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=17712183$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/16493422$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Stephens, Matthew</creatorcontrib><creatorcontrib>Sloan, James S</creatorcontrib><creatorcontrib>Robertson, P D</creatorcontrib><creatorcontrib>Scheet, Paul</creatorcontrib><creatorcontrib>Nickerson, Deborah A</creatorcontrib><title>Automating sequence-based detection and genotyping of SNPs from diploid samples</title><title>Nature genetics</title><addtitle>Nat Genet</addtitle><addtitle>Nat 
Genet</addtitle><description>The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47–90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.</description><subject>Agriculture</subject><subject>Algorithms</subject><subject>Animal Genetics and Genomics</subject><subject>Automation - methods</subject><subject>Biological and medical sciences</subject><subject>Biomedical and Life Sciences</subject><subject>Biomedicine</subject><subject>Cancer Research</subject><subject>Chromosomes</subject><subject>Deoxyribonucleic acid</subject><subject>Diploidy</subject><subject>DNA</subject><subject>DNA - genetics</subject><subject>Fluorescence</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>Gene amplification</subject><subject>Gene Function</subject><subject>Genetic Variation</subject><subject>Genetics of eukaryotes. 
Biological and molecular evolution</subject><subject>Genotype</subject><subject>Genotype &amp; phenotype</subject><subject>Genotypes</subject><subject>Human Genetics</subject><subject>Physiological aspects</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Reproducibility of Results</subject><subject>Sensitivity and Specificity</subject><subject>Single nucleotide polymorphisms</subject><subject>technical-report</subject><issn>1061-4036</issn><issn>1546-1718</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqN0luL1DAUAOAiintRf4FIUVR86Jpbk_RxWLwsLI646mvIpCelS5vUJAX332-GKTuMito8NCTfOclpT1E8wegMIyrfug4Lxu8Vx7hmvMICy_t5jjiuGKL8qDiJ8RohzBiSD4sjzFlDGSHHxXo1Jz_q1LuujPBjBmeg2ugIbdlCApN670rt2rID59PNtHXellefPsfSBj-WbT8Nvm_LqMdpgPioeGD1EOHx8j4tvr1_9_X8Y3W5_nBxvrqsDK9xqojRUmjWgqUNsZg3hCEEuKZNjaVoDSNaYizyowUwzQA2goPYIGYFq7Whp8XrXd4p-HzrmNTYRwPDoB34OSqZK6wRkyLLV3-VXPCG1Zz8E2KBMiVb-PwXeO3n4HK5ihDCKc0moxc71OkBVO-sT0GbbUa1wpLShspaZnX2B5VHC2NvvAPb5_WDgDcHAdkk-Jk6PceoLq6-_L9dfz-0S_Em-BgDWDWFftThRmGkth2mdh2W4bOl-HkzQrtnS0tl8HIBOho92KCd6ePeCYHJ9ty7Pxjzlusg7L_ib0c-3Umn0xzgLtWyfQtZkOmP</recordid><startdate>20060301</startdate><enddate>20060301</enddate><creator>Stephens, Matthew</creator><creator>Sloan, James S</creator><creator>Robertson, P D</creator><creator>Scheet, Paul</creator><creator>Nickerson, Deborah A</creator><general>Nature Publishing Group US</general><general>Nature Publishing 
Group</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISR</scope><scope>3V.</scope><scope>7QL</scope><scope>7QP</scope><scope>7QR</scope><scope>7SS</scope><scope>7T7</scope><scope>7TK</scope><scope>7TM</scope><scope>7U9</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M7N</scope><scope>M7P</scope><scope>MBDVC</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>20060301</creationdate><title>Automating sequence-based detection and genotyping of SNPs from diploid samples</title><author>Stephens, Matthew ; Sloan, James S ; Robertson, P D ; Scheet, Paul ; Nickerson, Deborah A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c651t-2ca87a4def392f1692400e15395187dc42a8117777a7e4a4eeb76e7b04f745ac3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Agriculture</topic><topic>Algorithms</topic><topic>Animal Genetics and Genomics</topic><topic>Automation - methods</topic><topic>Biological and medical sciences</topic><topic>Biomedical and Life Sciences</topic><topic>Biomedicine</topic><topic>Cancer 
Research</topic><topic>Chromosomes</topic><topic>Deoxyribonucleic acid</topic><topic>Diploidy</topic><topic>DNA</topic><topic>DNA - genetics</topic><topic>Fluorescence</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Gene amplification</topic><topic>Gene Function</topic><topic>Genetic Variation</topic><topic>Genetics of eukaryotes. Biological and molecular evolution</topic><topic>Genotype</topic><topic>Genotype &amp; phenotype</topic><topic>Genotypes</topic><topic>Human Genetics</topic><topic>Physiological aspects</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Reproducibility of Results</topic><topic>Sensitivity and Specificity</topic><topic>Single nucleotide polymorphisms</topic><topic>technical-report</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Stephens, Matthew</creatorcontrib><creatorcontrib>Sloan, James S</creatorcontrib><creatorcontrib>Robertson, P D</creatorcontrib><creatorcontrib>Scheet, Paul</creatorcontrib><creatorcontrib>Nickerson, Deborah A</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Opposing Viewpoints</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS 
Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Research Library</collection><collection>Algology Mycology and Protozoology Abstracts 
(Microbiology C)</collection><collection>Biological Science Database</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Nature genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Stephens, Matthew</au><au>Sloan, James S</au><au>Robertson, P D</au><au>Scheet, Paul</au><au>Nickerson, Deborah A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Automating sequence-based detection and genotyping of SNPs from diploid samples</atitle><jtitle>Nature genetics</jtitle><stitle>Nat Genet</stitle><addtitle>Nat Genet</addtitle><date>2006-03-01</date><risdate>2006</risdate><volume>38</volume><issue>3</issue><spage>375</spage><epage>381</epage><pages>375-381</pages><issn>1061-4036</issn><eissn>1546-1718</eissn><coden>NGENEC</coden><abstract>The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. 
Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47–90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.</abstract><cop>New York</cop><pub>Nature Publishing Group US</pub><pmid>16493422</pmid><doi>10.1038/ng1746</doi><tpages>7</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1061-4036
ispartof Nature genetics, 2006-03, Vol.38 (3), p.375-381
issn 1061-4036
1546-1718
language eng
recordid cdi_proquest_miscellaneous_864950487
source MEDLINE; Nature Journals Online; SpringerLink Journals - AutoHoldings
subjects Agriculture
Algorithms
Animal Genetics and Genomics
Automation - methods
Biological and medical sciences
Biomedical and Life Sciences
Biomedicine
Cancer Research
Chromosomes
Deoxyribonucleic acid
Diploidy
DNA
DNA - genetics
Fluorescence
Fundamental and applied biological sciences. Psychology
Gene amplification
Gene Function
Genetic Variation
Genetics of eukaryotes. Biological and molecular evolution
Genotype
Genotype & phenotype
Genotypes
Human Genetics
Physiological aspects
Polymorphism, Single Nucleotide
Reproducibility of Results
Sensitivity and Specificity
Single nucleotide polymorphisms
technical-report
title Automating sequence-based detection and genotyping of SNPs from diploid samples
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T05%3A13%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automating%20sequence-based%20detection%20and%20genotyping%20of%20SNPs%20from%20diploid%20samples&rft.jtitle=Nature%20genetics&rft.au=Stephens,%20Matthew&rft.date=2006-03-01&rft.volume=38&rft.issue=3&rft.spage=375&rft.epage=381&rft.pages=375-381&rft.issn=1061-4036&rft.eissn=1546-1718&rft.coden=NGENEC&rft_id=info:doi/10.1038/ng1746&rft_dat=%3Cgale_proqu%3EA183393858%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=222633223&rft_id=info:pmid/16493422&rft_galeid=A183393858&rfr_iscdi=true