Quantifying population genetic differentiation from next-generation sequencing data

Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data....

Detailed description

Bibliographic details
Published in: Genetics (Austin) 2013-11, Vol.195 (3), p.979-992
Main authors: Fumagalli, Matteo, Vieira, Filipe G, Korneliussen, Thorfinn Sand, Linderoth, Tyler, Huerta-Sánchez, Emilia, Albrechtsen, Anders, Nielsen, Rasmus
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 992
container_issue 3
container_start_page 979
container_title Genetics (Austin)
container_volume 195
creator Fumagalli, Matteo
Vieira, Filipe G
Korneliussen, Thorfinn Sand
Linderoth, Tyler
Huerta-Sánchez, Emilia
Albrechtsen, Anders
Nielsen, Rasmus
description Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.
doi_str_mv 10.1534/genetics.113.154740
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3143693041</sourcerecordid><originalsourceid>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</originalsourceid><addsrcrecordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1463868819</pqid></control><display><type>article</type><title>Quantifying population genetic differentiation from next-generation sequencing data</title><source>Electronic Journals Library</source><source>MEDLINE</source><source>Oxford University Press Journals</source><source>Alma/SFX Local Collection</source><creator>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creator><creatorcontrib>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creatorcontrib><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. 
In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><identifier>ISSN: 1943-2631</identifier><identifier>ISSN: 0016-6731</identifier><identifier>EISSN: 1943-2631</identifier><identifier>DOI: 10.1534/genetics.113.154740</identifier><identifier>PMID: 23979584</identifier><identifier>CODEN: GENTAE</identifier><language>eng</language><publisher>United States: Genetics Society of America</publisher><subject>Animals ; Bombyx - genetics ; Bombyx mori ; Cellular biology ; Computational Biology ; Computer Simulation ; Data Interpretation, Statistical ; Deoxyribonucleic acid ; DNA ; Genetic Drift ; Genetic Variation ; Genetics, Population - statistics &amp; numerical data ; Genotype ; High-Throughput Nucleotide Sequencing - statistics &amp; numerical data ; Investigations ; Likelihood Functions ; Models, Genetic ; Mutation ; Population genetics ; Principal Component Analysis ; Selection, Genetic</subject><ispartof>Genetics (Austin), 2013-11, Vol.195 (3), 
p.979-992</ispartof><rights>Copyright Genetics Society of America Nov 2013</rights><rights>Copyright © 2013 by the Genetics Society of America 2013</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</citedby><cites>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,777,781,882,27905,27906</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23979584$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><title>Quantifying population genetic differentiation from next-generation sequencing data</title><title>Genetics (Austin)</title><addtitle>Genetics</addtitle><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. 
Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><subject>Animals</subject><subject>Bombyx - genetics</subject><subject>Bombyx mori</subject><subject>Cellular biology</subject><subject>Computational Biology</subject><subject>Computer Simulation</subject><subject>Data Interpretation, Statistical</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Genetic Drift</subject><subject>Genetic Variation</subject><subject>Genetics, Population - statistics &amp; numerical data</subject><subject>Genotype</subject><subject>High-Throughput Nucleotide Sequencing - statistics &amp; numerical data</subject><subject>Investigations</subject><subject>Likelihood Functions</subject><subject>Models, Genetic</subject><subject>Mutation</subject><subject>Population genetics</subject><subject>Principal Component Analysis</subject><subject>Selection, 
Genetic</subject><issn>1943-2631</issn><issn>0016-6731</issn><issn>1943-2631</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</recordid><startdate>20131101</startdate><enddate>20131101</enddate><creator>Fumagalli, Matteo</creator><creator>Vieira, Filipe G</creator><creator>Korneliussen, Thorfinn Sand</creator><creator>Linderoth, Tyler</creator><creator>Huerta-Sánchez, Emilia</creator><creator>Albrechtsen, Anders</creator><creator>Nielsen, Rasmus</creator><general>Genetics Society of 
America</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>4T-</scope><scope>4U-</scope><scope>7QP</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>K9-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0K</scope><scope>M0R</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M2P</scope><scope>M7N</scope><scope>M7P</scope><scope>MBDVC</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20131101</creationdate><title>Quantifying population genetic differentiation from next-generation sequencing data</title><author>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Animals</topic><topic>Bombyx - genetics</topic><topic>Bombyx mori</topic><topic>Cellular biology</topic><topic>Computational 
Biology</topic><topic>Computer Simulation</topic><topic>Data Interpretation, Statistical</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Genetic Drift</topic><topic>Genetic Variation</topic><topic>Genetics, Population - statistics &amp; numerical data</topic><topic>Genotype</topic><topic>High-Throughput Nucleotide Sequencing - statistics &amp; numerical data</topic><topic>Investigations</topic><topic>Likelihood Functions</topic><topic>Models, Genetic</topic><topic>Mutation</topic><topic>Population genetics</topic><topic>Principal Component Analysis</topic><topic>Selection, Genetic</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Docstoc</collection><collection>University Readers</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni 
Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>Consumer Health Database (Alumni Edition)</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Agriculture Science Database</collection><collection>Consumer Health Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>ProQuest Research Library</collection><collection>ProQuest Science Journals</collection><collection>Algology Mycology and Protozoology Abstracts 
(Microbiology C)</collection><collection>ProQuest Biological Science Journals</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genetics (Austin)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fumagalli, Matteo</au><au>Vieira, Filipe G</au><au>Korneliussen, Thorfinn Sand</au><au>Linderoth, Tyler</au><au>Huerta-Sánchez, Emilia</au><au>Albrechtsen, Anders</au><au>Nielsen, Rasmus</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Quantifying population genetic differentiation from next-generation sequencing data</atitle><jtitle>Genetics (Austin)</jtitle><addtitle>Genetics</addtitle><date>2013-11-01</date><risdate>2013</risdate><volume>195</volume><issue>3</issue><spage>979</spage><epage>992</epage><pages>979-992</pages><issn>1943-2631</issn><issn>0016-6731</issn><eissn>1943-2631</eissn><coden>GENTAE</coden><abstract>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. 
Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</abstract><cop>United States</cop><pub>Genetics Society of America</pub><pmid>23979584</pmid><doi>10.1534/genetics.113.154740</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1943-2631
ispartof Genetics (Austin), 2013-11, Vol.195 (3), p.979-992
issn 1943-2631
0016-6731
1943-2631
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878
source Electronic Journals Library; MEDLINE; Oxford University Press Journals; Alma/SFX Local Collection
subjects Animals
Bombyx - genetics
Bombyx mori
Cellular biology
Computational Biology
Computer Simulation
Data Interpretation, Statistical
Deoxyribonucleic acid
DNA
Genetic Drift
Genetic Variation
Genetics, Population - statistics & numerical data
Genotype
High-Throughput Nucleotide Sequencing - statistics & numerical data
Investigations
Likelihood Functions
Models, Genetic
Mutation
Population genetics
Principal Component Analysis
Selection, Genetic
title Quantifying population genetic differentiation from next-generation sequencing data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T14%3A30%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quantifying%20population%20genetic%20differentiation%20from%20next-generation%20sequencing%20data&rft.jtitle=Genetics%20(Austin)&rft.au=Fumagalli,%20Matteo&rft.date=2013-11-01&rft.volume=195&rft.issue=3&rft.spage=979&rft.epage=992&rft.pages=979-992&rft.issn=1943-2631&rft.eissn=1943-2631&rft.coden=GENTAE&rft_id=info:doi/10.1534/genetics.113.154740&rft_dat=%3Cproquest_pubme%3E3143693041%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1463868819&rft_id=info:pmid/23979584&rfr_iscdi=true