Quantifying population genetic differentiation from next-generation sequencing data

Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data....

Detailed description

Bibliographic details
Published in:	Genetics (Austin) 2013-11, Vol.195 (3), p.979-992
Main authors:	Fumagalli, Matteo; Vieira, Filipe G; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus
Format:	Article
Language:	eng
Subjects:	Animals; Bombyx - genetics; Bombyx mori; Cellular biology; Computational Biology; Computer Simulation; Data Interpretation, Statistical; Deoxyribonucleic acid; DNA; Genetic Drift; Genetic Variation; Genetics, Population - statistics & numerical data; Genotype; High-Throughput Nucleotide Sequencing - statistics & numerical data; Investigations; Likelihood Functions; Models, Genetic; Mutation; Population genetics; Principal Component Analysis; Selection, Genetic
Online access:	Full text

container_end_page	992
container_issue	3
container_start_page	979
container_title	Genetics (Austin)
container_volume	195
creator	Fumagalli, Matteo; Vieira, Filipe G; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus
description	Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.
doi_str_mv	10.1534/genetics.113.154740
format	Article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3143693041</sourcerecordid><originalsourceid>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</originalsourceid><addsrcrecordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1463868819</pqid></control><display><type>article</type><title>Quantifying population genetic differentiation from next-generation sequencing data</title><source>Electronic Journals Library</source><source>MEDLINE</source><source>Oxford University Press Journals</source><source>Alma/SFX Local Collection</source><creator>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creator><creatorcontrib>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</creatorcontrib><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. 
In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><identifier>ISSN: 1943-2631</identifier><identifier>ISSN: 0016-6731</identifier><identifier>EISSN: 1943-2631</identifier><identifier>DOI: 10.1534/genetics.113.154740</identifier><identifier>PMID: 23979584</identifier><identifier>CODEN: GENTAE</identifier><language>eng</language><publisher>United States: Genetics Society of America</publisher><subject>Animals ; Bombyx - genetics ; Bombyx mori ; Cellular biology ; Computational Biology ; Computer Simulation ; Data Interpretation, Statistical ; Deoxyribonucleic acid ; DNA ; Genetic Drift ; Genetic Variation ; Genetics, Population - statistics & numerical data ; Genotype ; High-Throughput Nucleotide Sequencing - statistics & numerical data ; Investigations ; Likelihood Functions ; Models, Genetic ; Mutation ; Population genetics ; Principal Component Analysis ; Selection, Genetic</subject><ispartof>Genetics (Austin), 2013-11, Vol.195 (3), 
p.979-992</ispartof><rights>Copyright Genetics Society of America Nov 2013</rights><rights>Copyright © 2013 by the Genetics Society of America 2013</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</citedby><cites>FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,777,781,882,27905,27906</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23979584$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><title>Quantifying population genetic differentiation from next-generation sequencing data</title><title>Genetics (Austin)</title><addtitle>Genetics</addtitle><description>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. 
Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</description><subject>Animals</subject><subject>Bombyx - genetics</subject><subject>Bombyx mori</subject><subject>Cellular biology</subject><subject>Computational Biology</subject><subject>Computer Simulation</subject><subject>Data Interpretation, Statistical</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Genetic Drift</subject><subject>Genetic Variation</subject><subject>Genetics, Population - statistics & numerical data</subject><subject>Genotype</subject><subject>High-Throughput Nucleotide Sequencing - statistics & numerical data</subject><subject>Investigations</subject><subject>Likelihood Functions</subject><subject>Models, Genetic</subject><subject>Mutation</subject><subject>Population genetics</subject><subject>Principal Component Analysis</subject><subject>Selection, 
Genetic</subject><issn>1943-2631</issn><issn>0016-6731</issn><issn>1943-2631</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqFkU1LAzEQhoMo1q9fIEjBi5etm0w2yV4EKX6BIKKeQ5rN1pRtUpNdsf_elG2LevGUYeaZl3fyInSK8xEugF5OjTOt1XGEMaQO5TTfQQe4pJARBnj3Rz1AhzHO8jxnZSH20YBAyVNFD9DLc6dca-ulddPhwi-6RrXWu-FafFjZujbBJKTv18HPh858tdmKCH0zmo_OOL2SqFSrjtFerZpoTtbvEXq7vXkd32ePT3cP4-vHTFPG2qwyoAhlWBRalKTkKp8w4IQUrJxAQYk2nGuiQLFCcU0nStQCU62BVYmoNByhq1530U3mptLJZVCNXAQ7V2EpvbLy98TZdzn1nxIEBsFFErhYCwSfLoitnNuoTdMoZ3wXJS7yAkiZzP2PUloSLgiQhJ7_QWe-Cy79RKIYCCYEXglCT-ngYwym3vrGuVzlKzf5ypSv7PNNW2c_T97ubAKFb6Y5pRA</recordid><startdate>20131101</startdate><enddate>20131101</enddate><creator>Fumagalli, Matteo</creator><creator>Vieira, Filipe G</creator><creator>Korneliussen, Thorfinn Sand</creator><creator>Linderoth, Tyler</creator><creator>Huerta-Sánchez, Emilia</creator><creator>Albrechtsen, Anders</creator><creator>Nielsen, Rasmus</creator><general>Genetics Society of 
America</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>4T-</scope><scope>4U-</scope><scope>7QP</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>K9-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0K</scope><scope>M0R</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M2P</scope><scope>M7N</scope><scope>M7P</scope><scope>MBDVC</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20131101</creationdate><title>Quantifying population genetic differentiation from next-generation sequencing data</title><author>Fumagalli, Matteo ; Vieira, Filipe G ; Korneliussen, Thorfinn Sand ; Linderoth, Tyler ; Huerta-Sánchez, Emilia ; Albrechtsen, Anders ; Nielsen, Rasmus</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c466t-de3a246185c89297a0b63722569b3542ce77c2a3a65a7c4ba8f814cc36d69bdc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Animals</topic><topic>Bombyx - genetics</topic><topic>Bombyx mori</topic><topic>Cellular biology</topic><topic>Computational 
Biology</topic><topic>Computer Simulation</topic><topic>Data Interpretation, Statistical</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Genetic Drift</topic><topic>Genetic Variation</topic><topic>Genetics, Population - statistics & numerical data</topic><topic>Genotype</topic><topic>High-Throughput Nucleotide Sequencing - statistics & numerical data</topic><topic>Investigations</topic><topic>Likelihood Functions</topic><topic>Models, Genetic</topic><topic>Mutation</topic><topic>Population genetics</topic><topic>Principal Component Analysis</topic><topic>Selection, Genetic</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fumagalli, Matteo</creatorcontrib><creatorcontrib>Vieira, Filipe G</creatorcontrib><creatorcontrib>Korneliussen, Thorfinn Sand</creatorcontrib><creatorcontrib>Linderoth, Tyler</creatorcontrib><creatorcontrib>Huerta-Sánchez, Emilia</creatorcontrib><creatorcontrib>Albrechtsen, Anders</creatorcontrib><creatorcontrib>Nielsen, Rasmus</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Docstoc</collection><collection>University Readers</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma 
Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Agricultural & Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>Consumer Health Database (Alumni Edition)</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Agriculture Science Database</collection><collection>Consumer Health Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>ProQuest Research Library</collection><collection>ProQuest Science Journals</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>ProQuest Biological Science 
Journals</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genetics (Austin)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fumagalli, Matteo</au><au>Vieira, Filipe G</au><au>Korneliussen, Thorfinn Sand</au><au>Linderoth, Tyler</au><au>Huerta-Sánchez, Emilia</au><au>Albrechtsen, Anders</au><au>Nielsen, Rasmus</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Quantifying population genetic differentiation from next-generation sequencing data</atitle><jtitle>Genetics (Austin)</jtitle><addtitle>Genetics</addtitle><date>2013-11-01</date><risdate>2013</risdate><volume>195</volume><issue>3</issue><spage>979</spage><epage>992</epage><pages>979-992</pages><issn>1943-2631</issn><issn>0016-6731</issn><eissn>1943-2631</eissn><coden>GENTAE</coden><abstract>Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. 
Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.</abstract><cop>United States</cop><pub>Genetics Society of America</pub><pmid>23979584</pmid><doi>10.1534/genetics.113.154740</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1943-2631
ispartof	Genetics (Austin), 2013-11, Vol.195 (3), p.979-992
issn	1943-2631 0016-6731 1943-2631
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3813878
source	Electronic Journals Library; MEDLINE; Oxford University Press Journals; Alma/SFX Local Collection
subjects	Animals; Bombyx - genetics; Bombyx mori; Cellular biology; Computational Biology; Computer Simulation; Data Interpretation, Statistical; Deoxyribonucleic acid; DNA; Genetic Drift; Genetic Variation; Genetics, Population - statistics & numerical data; Genotype; High-Throughput Nucleotide Sequencing - statistics & numerical data; Investigations; Likelihood Functions; Models, Genetic; Mutation; Population genetics; Principal Component Analysis; Selection, Genetic
title	Quantifying population genetic differentiation from next-generation sequencing data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T14%3A30%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quantifying%20population%20genetic%20differentiation%20from%20next-generation%20sequencing%20data&rft.jtitle=Genetics%20(Austin)&rft.au=Fumagalli,%20Matteo&rft.date=2013-11-01&rft.volume=195&rft.issue=3&rft.spage=979&rft.epage=992&rft.pages=979-992&rft.issn=1943-2631&rft.eissn=1943-2631&rft.coden=GENTAE&rft_id=info:doi/10.1534/genetics.113.154740&rft_dat=%3Cproquest_pubme%3E3143693041%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1463868819&rft_id=info:pmid/23979584&rfr_iscdi=true