Quantifying population genetic differentiation from next-generation sequencing data
Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data....
Published in: | Genetics (Austin) 2013-11, Vol.195 (3), p.979-992 |
---|---|
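Main authors: | Fumagalli, Matteo; Vieira, Filipe G; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus |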
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 992 |
---|---|
container_issue | 3 |
container_start_page | 979 |
container_title | Genetics (Austin) |
container_volume | 195 |
creator | Fumagalli, Matteo; Vieira, Filipe G; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus |
description | Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage. |
doi_str_mv | 10.1534/genetics.113.154740 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3143693041</sourcerecordid><originalsourceid>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</originalsourceid><addsrcrecordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1463868819</pqid></control><display><type>article</type><title>Quantifying population genetic differentiation from next-generation sequencing data</title><source>Electronic Journals Library</source><source>MEDLINE</source><source>Oxford University Press Journals</source><source>Alma/SFX Local Collection</source><creator>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creator><creatorcontrib>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creatorcontrib><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. 
In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><identifier>ISSN: 1943-2631</identifier><identifier>ISSN: 0016-6731</identifier><identifier>EISSN: 1943-2631</identifier><identifier>DOI: 10.1534/genetics.113.154740</identifier><identifier>PMID: 23979584</identifier><identifier>CODEN: GENTAE</identifier><language>eng</language><publisher>United States: Genetics Society of America</publisher><subject>Animals ; Bombyx - genetics ; Bombyx mori ; Cellular biology ; Computational Biology ; Computer Simulation ; Data Interpretation, Statistical ; Deoxyribonucleic acid ; DNA ; Genetic Drift ; Genetic Variation ; Genetics, Population - statistics & numerical data ; Genotype ; High-Throughput Nucleotide Sequencing - statistics & numerical data ; Investigations ; Likelihood Functions ; Models, Genetic ; Mutation ; Population genetics ; Principal Component Analysis ; Selection, Genetic</subject><ispartof>Genetics (Austin), 2013-11, Vol.195 (3), 
p.979-992</ispartof><rights>Copyright Genetics Society of America Nov 2013</rights><rights>Copyright © 2013 by the Genetics Society of America 2013</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</citedby><cites>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,777,781,882,27905,27906</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23979584$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><title>Quantifying population genetic differentiation from next-generation sequencing data</title><title>Genetics (Austin)</title><addtitle>Genetics</addtitle><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. 
Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><subject>Animals</subject><subject>Bombyx - genetics</subject><subject>Bombyx mori</subject><subject>Cellular biology</subject><subject>Computational Biology</subject><subject>Computer Simulation</subject><subject>Data Interpretation, Statistical</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Genetic Drift</subject><subject>Genetic Variation</subject><subject>Genetics, Population - statistics & numerical data</subject><subject>Genotype</subject><subject>High-Throughput Nucleotide Sequencing - statistics & numerical data</subject><subject>Investigations</subject><subject>Likelihood Functions</subject><subject>Models, Genetic</subject><subject>Mutation</subject><subject>Population genetics</subject><subject>Principal Component Analysis</subject><subject>Selection, 
Genetic</subject><issn>1943-2631</issn><issn>0016-6731</issn><issn>1943-2631</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</recordid><startdate>20131101</startdate><enddate>20131101</enddate><creator>Fumagalli, Matteo</creator><creator>Vieira, Filipe G</creator><creator>Korneliussen, Thorfinn Sand</creator><creator>Linderoth, Tyler</creator><creator>Huerta-Sánchez, Emilia</creator><creator>Albrechtsen, Anders</creator><creator>Nielsen, Rasmus</creator><general>Genetics Society of 
America</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>4T-</scope><scope>4U-</scope><scope>7QP</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>K9-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0K</scope><scope>M0R</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M2P</scope><scope>M7N</scope><scope>M7P</scope><scope>MBDVC</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20131101</creationdate><title>Quantifying population genetic differentiation from next-generation sequencing data</title><author>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Animals</topic><topic>Bombyx - genetics</topic><topic>Bombyx mori</topic><topic>Cellular biology</topic><topic>Computational 
Biology</topic><topic>Computer Simulation</topic><topic>Data Interpretation, Statistical</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Genetic Drift</topic><topic>Genetic Variation</topic><topic>Genetics, Population - statistics & numerical data</topic><topic>Genotype</topic><topic>High-Throughput Nucleotide Sequencing - statistics & numerical data</topic><topic>Investigations</topic><topic>Likelihood Functions</topic><topic>Models, Genetic</topic><topic>Mutation</topic><topic>Population genetics</topic><topic>Principal Component Analysis</topic><topic>Selection, Genetic</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Docstoc</collection><collection>University Readers</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma 
Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Agricultural & Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>Consumer Health Database (Alumni Edition)</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Agriculture Science Database</collection><collection>Consumer Health Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>ProQuest Research Library</collection><collection>ProQuest Science Journals</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>ProQuest Biological Science 
Journals</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genetics (Austin)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fumagalli, Matteo</au><au>Vieira, Filipe G</au><au>Korneliussen, Thorfinn Sand</au><au>Linderoth, Tyler</au><au>Huerta-Sánchez, Emilia</au><au>Albrechtsen, Anders</au><au>Nielsen, Rasmus</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Quantifying population genetic differentiation from next-generation sequencing data</atitle><jtitle>Genetics (Austin)</jtitle><addtitle>Genetics</addtitle><date>2013-11-01</date><risdate>2013</risdate><volume>195</volume><issue>3</issue><spage>979</spage><epage>992</epage><pages>979-992</pages><issn>1943-2631</issn><issn>0016-6731</issn><eissn>1943-2631</eissn><coden>GENTAE</coden><abstract>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. 
Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</abstract><cop>United States</cop><pub>Genetics Society of America</pub><pmid>23979584</pmid><doi>10.1534/genetics.113.154740</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1943-2631 |
ispartof | Genetics (Austin), 2013-11, Vol.195 (3), p.979-992 |
issn | 1943-2631; 0016-6731; 1943-2631 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878 |
source | Electronic Journals Library; MEDLINE; Oxford University Press Journals; Alma/SFX Local Collection |
subjects | Animals; Bombyx - genetics; Bombyx mori; Cellular biology; Computational Biology; Computer Simulation; Data Interpretation, Statistical; Deoxyribonucleic acid; DNA; Genetic Drift; Genetic Variation; Genetics, Population - statistics & numerical data; Genotype; High-Throughput Nucleotide Sequencing - statistics & numerical data; Investigations; Likelihood Functions; Models, Genetic; Mutation; Population genetics; Principal Component Analysis; Selection, Genetic |
title | Quantifying population genetic differentiation from next-generation sequencing data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T14%3A30%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quantifying%20population%20genetic%20differentiation%20from%20next-generation%20sequencing%20data&rft.jtitle=Genetics%20(Austin)&rft.au=Fumagalli,%20Matteo&rft.date=2013-11-01&rft.volume=195&rft.issue=3&rft.spage=979&rft.epage=992&rft.pages=979-992&rft.issn=1943-2631&rft.eissn=1943-2631&rft.coden=GENTAE&rft_id=info:doi/10.1534/genetics.113.154740&rft_dat=%3Cproquest_pubme%3E3143693041%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1463868819&rft_id=info:pmid/23979584&rfr_iscdi=true |
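
The `url` field above is a link-resolver address whose `ctx_ver` parameter identifies it as an OpenURL in the Z39.88-2004 key/encoded-value style: the `rft.*` keys describe the cited article and the repeated `rft_id` keys carry its identifiers. As a rough illustration only, the Python sketch below decodes such a link with the standard library. The helper names `referent_fields` and `identifiers` are hypothetical, and the `OPENURL` constant is a shortened copy of the record's link that keeps only a subset of its parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the record's "url" field (assumption: only some of the
# original query parameters are kept, to make the example easy to read).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Quantifying%20population%20genetic%20differentiation"
    "%20from%20next-generation%20sequencing%20data"
    "&rft.jtitle=Genetics%20(Austin)"
    "&rft.au=Fumagalli,%20Matteo"
    "&rft.date=2013-11-01&rft.volume=195&rft.issue=3"
    "&rft.spage=979&rft.epage=992"
    "&rft_id=info:doi/10.1534/genetics.113.154740"
    "&rft_id=info:pmid/23979584"
)


def referent_fields(url):
    """Return the rft.* (referent) keys of an OpenURL as a plain dict.

    Hypothetical helper: keeps the first value of each key and strips
    the "rft." prefix; percent-encoding is decoded by parse_qs.
    """
    query = parse_qs(urlsplit(url).query)
    return {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }


def identifiers(url):
    """Map identifier scheme (e.g. "doi", "pmid") to its value.

    Hypothetical helper: reads every rft_id of the form info:<scheme>/<id>
    and skips entries whose identifier part is empty.
    """
    query = parse_qs(urlsplit(url).query)
    found = {}
    for value in query.get("rft_id", []):
        if value.startswith("info:"):
            scheme, _, ident = value[len("info:"):].partition("/")
            if ident:
                found[scheme] = ident
    return found


if __name__ == "__main__":
    for name, value in referent_fields(OPENURL).items():
        print(f"{name}: {value}")
    print(identifiers(OPENURL))
```

Run against the full link from the record, the same helpers would also pick up keys such as `rft.issn` and `rft.coden`; the empty `info:oai/` identifier in that link is skipped by the `if ident` check.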