Exact limits of inference in coalescent models

Recovery of population size history from molecular sequence data is an important problem in population genetics. Inference commonly relies on a coalescent model linking the population size history to genealogies. The high computational cost of estimating parameters from these models usually compels...

Bibliographic details
Published in: Theoretical population biology 2019-02, Vol.125, p.75-93
Main authors: Johndrow, James E., Palacios, Julia A.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 93
container_issue
container_start_page 75
container_title Theoretical population biology
container_volume 125
creator Johndrow, James E.
Palacios, Julia A.
description Recovery of population size history from molecular sequence data is an important problem in population genetics. Inference commonly relies on a coalescent model linking the population size history to genealogies. The high computational cost of estimating parameters from these models usually compels researchers to select a subset of the available data or to rely on insufficient summary statistics for statistical inference. We consider the problem of recovering the true population size history from two possible alternatives on the basis of coalescent time data previously considered by Kim et al. (2015). We improve upon previous results by giving exact expressions for the probability of correctly distinguishing between the two hypotheses as a function of the separation between the alternative size histories, the number of individuals, loci, and the sampling times. In more complicated settings we estimate the exact probability of correct recovery by Monte Carlo simulation. Our results give considerably more pessimistic inferential limits than those previously reported. We also extended our analyses to pairwise SMC and SMC’ models of recombination. This work is relevant for optimal design when the inference goal is to test scientific hypotheses about population size trajectories in coalescent models with and without recombination.
doi_str_mv 10.1016/j.tpb.2018.11.004
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6541399</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0040580918300248</els_id><sourcerecordid>2159987228</sourcerecordid><originalsourceid>FETCH-LOGICAL-c451t-71e6aa6a11e51c58f8c9e1d435f0da348c29526c5b73c28716e28437fb4e4b573</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMoun78AC_So5fWTNK0CYIgsn6A4EXPIU2nmqVt1qQr-u_Nsip68TTDzDvvzDyEHAMtgEJ1tiimZVMwCrIAKCgtt8gMqKpyypnYJrNUobmQVO2R_RgXlFIJnO-SPU5FDUqoGSnm78ZOWe8GN8XMd5kbOww4WkxZZr3pMVocp2zwLfbxkOx0po949BUPyNP1_PHqNr9_uLm7urzPbSlgymvAypjKAKAAK2QnrUJoSy462hpeSsuUYJUVTc0tkzVUyGTJ664psWxEzQ_IxcZ3uWoGbNcXBNPrZXCDCR_aG6f_dkb3op_9m65ECVypZHD6ZRD86wrjpAeXHul7M6JfRc1AKCVrxmSSwkZqg48xYPezBqhec9YLnTjrNWcNoBPVNHPy-76fiW-wSXC-ESRo-OYw6GjdGmvrAtpJt979Y_8J09KNpA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2159987228</pqid></control><display><type>article</type><title>Exact limits of inference in coalescent models</title><source>MEDLINE</source><source>ScienceDirect Journals (5 years ago - present)</source><creator>Johndrow, James E. ; Palacios, Julia A.</creator><creatorcontrib>Johndrow, James E. ; Palacios, Julia A.</creatorcontrib><description>Recovery of population size history from molecular sequence data is an important problem in population genetics. Inference commonly relies on a coalescent model linking the population size history to genealogies. The high computational cost of estimating parameters from these models usually compels researchers to select a subset of the available data or to rely on insufficient summary statistics for statistical inference. We consider the problem of recovering the true population size history from two possible alternatives on the basis of coalescent time data previously considered by Kim et al. (2015). 
We improve upon previous results by giving exact expressions for the probability of correctly distinguishing between the two hypotheses as a function of the separation between the alternative size histories, the number of individuals, loci, and the sampling times. In more complicated settings we estimate the exact probability of correct recovery by Monte Carlo simulation. Our results give considerably more pessimistic inferential limits than those previously reported. We also extended our analyses to pairwise SMC and SMC’ models of recombination. This work is relevant for optimal design when the inference goal is to test scientific hypotheses about population size trajectories in coalescent models with and without recombination.</description><identifier>ISSN: 0040-5809</identifier><identifier>EISSN: 1096-0325</identifier><identifier>DOI: 10.1016/j.tpb.2018.11.004</identifier><identifier>PMID: 30571959</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Bayes error rates ; Bayes Theorem ; Coalescent ; Effective population size ; Genetic Variation ; Genetics, Population - statistics &amp; numerical data ; Markov Chains ; Molecular Sequence Data ; Population Density ; Sequentially Markov coalescent</subject><ispartof>Theoretical population biology, 2019-02, Vol.125, p.75-93</ispartof><rights>2018 The Authors</rights><rights>Copyright © 2018 The Authors. Published by Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c451t-71e6aa6a11e51c58f8c9e1d435f0da348c29526c5b73c28716e28437fb4e4b573</citedby><cites>FETCH-LOGICAL-c451t-71e6aa6a11e51c58f8c9e1d435f0da348c29526c5b73c28716e28437fb4e4b573</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.tpb.2018.11.004$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,780,784,885,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30571959$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Johndrow, James E.</creatorcontrib><creatorcontrib>Palacios, Julia A.</creatorcontrib><title>Exact limits of inference in coalescent models</title><title>Theoretical population biology</title><addtitle>Theor Popul Biol</addtitle><description>Recovery of population size history from molecular sequence data is an important problem in population genetics. Inference commonly relies on a coalescent model linking the population size history to genealogies. The high computational cost of estimating parameters from these models usually compels researchers to select a subset of the available data or to rely on insufficient summary statistics for statistical inference. We consider the problem of recovering the true population size history from two possible alternatives on the basis of coalescent time data previously considered by Kim et al. (2015). We improve upon previous results by giving exact expressions for the probability of correctly distinguishing between the two hypotheses as a function of the separation between the alternative size histories, the number of individuals, loci, and the sampling times. 
In more complicated settings we estimate the exact probability of correct recovery by Monte Carlo simulation. Our results give considerably more pessimistic inferential limits than those previously reported. We also extended our analyses to pairwise SMC and SMC’ models of recombination. This work is relevant for optimal design when the inference goal is to test scientific hypotheses about population size trajectories in coalescent models with and without recombination.</description><subject>Bayes error rates</subject><subject>Bayes Theorem</subject><subject>Coalescent</subject><subject>Effective population size</subject><subject>Genetic Variation</subject><subject>Genetics, Population - statistics &amp; numerical data</subject><subject>Markov Chains</subject><subject>Molecular Sequence Data</subject><subject>Population Density</subject><subject>Sequentially Markov coalescent</subject><issn>0040-5809</issn><issn>1096-0325</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kE1LxDAQhoMoun78AC_So5fWTNK0CYIgsn6A4EXPIU2nmqVt1qQr-u_Nsip68TTDzDvvzDyEHAMtgEJ1tiimZVMwCrIAKCgtt8gMqKpyypnYJrNUobmQVO2R_RgXlFIJnO-SPU5FDUqoGSnm78ZOWe8GN8XMd5kbOww4WkxZZr3pMVocp2zwLfbxkOx0po949BUPyNP1_PHqNr9_uLm7urzPbSlgymvAypjKAKAAK2QnrUJoSy462hpeSsuUYJUVTc0tkzVUyGTJ664psWxEzQ_IxcZ3uWoGbNcXBNPrZXCDCR_aG6f_dkb3op_9m65ECVypZHD6ZRD86wrjpAeXHul7M6JfRc1AKCVrxmSSwkZqg48xYPezBqhec9YLnTjrNWcNoBPVNHPy-76fiW-wSXC-ESRo-OYw6GjdGmvrAtpJt979Y_8J09KNpA</recordid><startdate>20190201</startdate><enddate>20190201</enddate><creator>Johndrow, James E.</creator><creator>Palacios, Julia A.</creator><general>Elsevier 
Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20190201</creationdate><title>Exact limits of inference in coalescent models</title><author>Johndrow, James E. ; Palacios, Julia A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c451t-71e6aa6a11e51c58f8c9e1d435f0da348c29526c5b73c28716e28437fb4e4b573</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Bayes error rates</topic><topic>Bayes Theorem</topic><topic>Coalescent</topic><topic>Effective population size</topic><topic>Genetic Variation</topic><topic>Genetics, Population - statistics &amp; numerical data</topic><topic>Markov Chains</topic><topic>Molecular Sequence Data</topic><topic>Population Density</topic><topic>Sequentially Markov coalescent</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Johndrow, James E.</creatorcontrib><creatorcontrib>Palacios, Julia A.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Theoretical population biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Johndrow, James E.</au><au>Palacios, Julia A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exact limits of inference in 
coalescent models</atitle><jtitle>Theoretical population biology</jtitle><addtitle>Theor Popul Biol</addtitle><date>2019-02-01</date><risdate>2019</risdate><volume>125</volume><spage>75</spage><epage>93</epage><pages>75-93</pages><issn>0040-5809</issn><eissn>1096-0325</eissn><abstract>Recovery of population size history from molecular sequence data is an important problem in population genetics. Inference commonly relies on a coalescent model linking the population size history to genealogies. The high computational cost of estimating parameters from these models usually compels researchers to select a subset of the available data or to rely on insufficient summary statistics for statistical inference. We consider the problem of recovering the true population size history from two possible alternatives on the basis of coalescent time data previously considered by Kim et al. (2015). We improve upon previous results by giving exact expressions for the probability of correctly distinguishing between the two hypotheses as a function of the separation between the alternative size histories, the number of individuals, loci, and the sampling times. In more complicated settings we estimate the exact probability of correct recovery by Monte Carlo simulation. Our results give considerably more pessimistic inferential limits than those previously reported. We also extended our analyses to pairwise SMC and SMC’ models of recombination. This work is relevant for optimal design when the inference goal is to test scientific hypotheses about population size trajectories in coalescent models with and without recombination.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>30571959</pmid><doi>10.1016/j.tpb.2018.11.004</doi><tpages>19</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0040-5809
ispartof Theoretical population biology, 2019-02, Vol.125, p.75-93
issn 0040-5809
1096-0325
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6541399
source MEDLINE; ScienceDirect Journals (5 years ago - present)
subjects Bayes error rates
Bayes Theorem
Coalescent
Effective population size
Genetic Variation
Genetics, Population - statistics & numerical data
Markov Chains
Molecular Sequence Data
Population Density
Sequentially Markov coalescent
title Exact limits of inference in coalescent models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T01%3A44%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exact%20limits%20of%20inference%20in%20coalescent%20models&rft.jtitle=Theoretical%20population%20biology&rft.au=Johndrow,%20James%20E.&rft.date=2019-02-01&rft.volume=125&rft.spage=75&rft.epage=93&rft.pages=75-93&rft.issn=0040-5809&rft.eissn=1096-0325&rft_id=info:doi/10.1016/j.tpb.2018.11.004&rft_dat=%3Cproquest_pubme%3E2159987228%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2159987228&rft_id=info:pmid/30571959&rft_els_id=S0040580918300248&rfr_iscdi=true