SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation

Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are...

Bibliographic details
Published in: Molecular ecology resources 2022-07, Vol.22 (5), p.2054-2069
Main authors: Steenderen, Clarke J. M., Sutton, Guy F.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2069
container_issue 5
container_start_page 2054
container_title Molecular ecology resources
container_volume 22
creator Steenderen, Clarke J. M.
Sutton, Guy F.
description Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated data sets, where model assumptions are not violated. Here, we present a user‐friendly R Shiny application, ‘SPEDE‐sampler’ (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that (1) sample size, (2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and (3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it with GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE‐sampler, allowing for the further investigation of potential cryptic species or geographical substructuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program's workflow, and the variation that can arise when applying the GMYC model to empirical data sets. The R Shiny program is available for download at https://github.com/clarkevansteenderen/spede_sampler_R.
doi_str_mv 10.1111/1755-0998.13591
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9306842</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2673565605</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4211-53a3c71df9413d7426a52d0a478e66d06003401fc4f493fb6a1c80121a7240bb3</originalsourceid><addsrcrecordid>eNqFksFuEzEQhlcIREvhzA1Z4sIlrb22d9cckKqQFqQWEAUJTpbjnc26cuzF9rYNJx6Bp-DBeBKcpETABR9sy_PN7_k1UxSPCT4keR2RmvMJFqI5JJQLcqfY373c3d2bT3vFgxgvMa6wqNn9Yo9yLBjH5X7x4-Ld7OXs57fvUS0HC-E5OnboPbrojVshNQzWaJWMdyh5pGKEGFHvr9ESUu9bb_0ixy3SvTcaIlKuRUndZHwjZ9wCaeWQ6jrQCZ2Cg6Cs-QotOjc3ef88WkBTryxEDS4hP6ZhTBsZ4xKEIUDafP-wuNcpG-HR7XlQfDyZfZi-mpy9PX09PT6baFYSMuFUUV2TthOM0LZmZaV42WLF6gaqqs3-MWWYdJp1TNBuXimiG0xKouqS4fmcHhQvtrrDOF9Cuy4qVyyHYJYqrKRXRv4dcaaXC38lBcVVw8os8OxWIPgvI8QklyZ7s1Y58GOUZVUyIkTZNBl9-g966cfgsr1M1ZRXvMI8U0dbSgcfY4BuVwzBcj0Dct1lue643MxAznjyp4cd_7vpGeBb4NpYWP1PT57P3myFfwHja79A</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2673565605</pqid></control><display><type>article</type><title>SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation</title><source>MEDLINE</source><source>Wiley Online Library Journals Frontfile Complete</source><creator>Steenderen, Clarke J. M. ; Sutton, Guy F.</creator><creatorcontrib>Steenderen, Clarke J. M. ; Sutton, Guy F.</creatorcontrib><description>Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. 
Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated data sets, where model assumptions are not violated. Here, we present a user‐friendly R Shiny application, ‘SPEDE‐sampler’ (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that (1) sample size, (2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and (3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it with GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE‐sampler, allowing for the further investigation of potential cryptic species or geographical substructuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program's workflow, and the variation that can arise when applying the GMYC model to empirical data sets. 
The R Shiny program is available for download at https://github.com/clarkevansteenderen/spede_sampler_R.</description><identifier>ISSN: 1755-098X</identifier><identifier>EISSN: 1755-0998</identifier><identifier>DOI: 10.1111/1755-0998.13591</identifier><identifier>PMID: 35094502</identifier><language>eng</language><publisher>England: Wiley Subscription Services, Inc</publisher><subject>barcoding ; Computer applications ; Cryptic species ; Datasets ; DNA Barcoding, Taxonomic - methods ; Ecotypes ; New species ; Phylogeny ; Resource ; RESOURCE ARTICLES ; Sample Size ; Sampling ; singletons ; species delimitation ; Species richness ; Taxonomy ; Workflow</subject><ispartof>Molecular ecology resources, 2022-07, Vol.22 (5), p.2054-2069</ispartof><rights>2022 The Authors. published by John Wiley &amp; Sons Ltd.</rights><rights>2022 The Authors. Molecular Ecology Resources published by John Wiley &amp; Sons Ltd.</rights><rights>2022. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c4211-53a3c71df9413d7426a52d0a478e66d06003401fc4f493fb6a1c80121a7240bb3</cites><orcidid>0000-0003-2405-0945 ; 0000-0002-4219-446X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1111%2F1755-0998.13591$$EPDF$$P50$$Gwiley$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1111%2F1755-0998.13591$$EHTML$$P50$$Gwiley$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,1411,27901,27902,45550,45551</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35094502$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Steenderen, Clarke J. M.</creatorcontrib><creatorcontrib>Sutton, Guy F.</creatorcontrib><title>SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation</title><title>Molecular ecology resources</title><addtitle>Mol Ecol Resour</addtitle><description>Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated data sets, where model assumptions are not violated. 
Here, we present a user‐friendly R Shiny application, ‘SPEDE‐sampler’ (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that (1) sample size, (2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and (3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it with GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE‐sampler, allowing for the further investigation of potential cryptic species or geographical substructuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program's workflow, and the variation that can arise when applying the GMYC model to empirical data sets. 
The R Shiny program is available for download at https://github.com/clarkevansteenderen/spede_sampler_R.</description><subject>barcoding</subject><subject>Computer applications</subject><subject>Cryptic species</subject><subject>Datasets</subject><subject>DNA Barcoding, Taxonomic - methods</subject><subject>Ecotypes</subject><subject>New species</subject><subject>Phylogeny</subject><subject>Resource</subject><subject>RESOURCE ARTICLES</subject><subject>Sample Size</subject><subject>Sampling</subject><subject>singletons</subject><subject>species delimitation</subject><subject>Species richness</subject><subject>Taxonomy</subject><subject>Workflow</subject><issn>1755-098X</issn><issn>1755-0998</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>24P</sourceid><sourceid>EIF</sourceid><recordid>eNqFksFuEzEQhlcIREvhzA1Z4sIlrb22d9cckKqQFqQWEAUJTpbjnc26cuzF9rYNJx6Bp-DBeBKcpETABR9sy_PN7_k1UxSPCT4keR2RmvMJFqI5JJQLcqfY373c3d2bT3vFgxgvMa6wqNn9Yo9yLBjH5X7x4-Ld7OXs57fvUS0HC-E5OnboPbrojVshNQzWaJWMdyh5pGKEGFHvr9ESUu9bb_0ixy3SvTcaIlKuRUndZHwjZ9wCaeWQ6jrQCZ2Cg6Cs-QotOjc3ef88WkBTryxEDS4hP6ZhTBsZ4xKEIUDafP-wuNcpG-HR7XlQfDyZfZi-mpy9PX09PT6baFYSMuFUUV2TthOM0LZmZaV42WLF6gaqqs3-MWWYdJp1TNBuXimiG0xKouqS4fmcHhQvtrrDOF9Cuy4qVyyHYJYqrKRXRv4dcaaXC38lBcVVw8os8OxWIPgvI8QklyZ7s1Y58GOUZVUyIkTZNBl9-g966cfgsr1M1ZRXvMI8U0dbSgcfY4BuVwzBcj0Dct1lue643MxAznjyp4cd_7vpGeBb4NpYWP1PT57P3myFfwHja79A</recordid><startdate>202207</startdate><enddate>202207</enddate><creator>Steenderen, Clarke J. 
M.</creator><creator>Sutton, Guy F.</creator><general>Wiley Subscription Services, Inc</general><general>John Wiley and Sons Inc</general><scope>24P</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SN</scope><scope>7SS</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-2405-0945</orcidid><orcidid>https://orcid.org/0000-0002-4219-446X</orcidid></search><sort><creationdate>202207</creationdate><title>SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation</title><author>Steenderen, Clarke J. M. ; Sutton, Guy F.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4211-53a3c71df9413d7426a52d0a478e66d06003401fc4f493fb6a1c80121a7240bb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>barcoding</topic><topic>Computer applications</topic><topic>Cryptic species</topic><topic>Datasets</topic><topic>DNA Barcoding, Taxonomic - methods</topic><topic>Ecotypes</topic><topic>New species</topic><topic>Phylogeny</topic><topic>Resource</topic><topic>RESOURCE ARTICLES</topic><topic>Sample Size</topic><topic>Sampling</topic><topic>singletons</topic><topic>species delimitation</topic><topic>Species richness</topic><topic>Taxonomy</topic><topic>Workflow</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Steenderen, Clarke J. 
M.</creatorcontrib><creatorcontrib>Sutton, Guy F.</creatorcontrib><collection>Wiley Online Library Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Molecular ecology resources</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Steenderen, Clarke J. M.</au><au>Sutton, Guy F.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation</atitle><jtitle>Molecular ecology resources</jtitle><addtitle>Mol Ecol Resour</addtitle><date>2022-07</date><risdate>2022</risdate><volume>22</volume><issue>5</issue><spage>2054</spage><epage>2069</epage><pages>2054-2069</pages><issn>1755-098X</issn><eissn>1755-0998</eissn><abstract>Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. 
Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated data sets, where model assumptions are not violated. Here, we present a user‐friendly R Shiny application, ‘SPEDE‐sampler’ (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that (1) sample size, (2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and (3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it with GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE‐sampler, allowing for the further investigation of potential cryptic species or geographical substructuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program's workflow, and the variation that can arise when applying the GMYC model to empirical data sets. The R Shiny program is available for download at https://github.com/clarkevansteenderen/spede_sampler_R.</abstract><cop>England</cop><pub>Wiley Subscription Services, Inc</pub><pmid>35094502</pmid><doi>10.1111/1755-0998.13591</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0003-2405-0945</orcidid><orcidid>https://orcid.org/0000-0002-4219-446X</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1755-098X
ispartof Molecular ecology resources, 2022-07, Vol.22 (5), p.2054-2069
issn 1755-098X
1755-0998
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9306842
source MEDLINE; Wiley Online Library Journals Frontfile Complete
subjects barcoding
Computer applications
Cryptic species
Datasets
DNA Barcoding, Taxonomic - methods
Ecotypes
New species
Phylogeny
Resource
RESOURCE ARTICLES
Sample Size
Sampling
singletons
species delimitation
Species richness
Taxonomy
Workflow
title SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T10%3A55%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SPEDE%E2%80%90sampler:%20An%20R%20Shiny%20application%20to%20assess%20how%20methodological%20choices%20and%20taxon%20sampling%20can%20affect%20Generalized%20Mixed%20Yule%20Coalescent%20output%20and%20interpretation&rft.jtitle=Molecular%20ecology%20resources&rft.au=Steenderen,%20Clarke%20J.%20M.&rft.date=2022-07&rft.volume=22&rft.issue=5&rft.spage=2054&rft.epage=2069&rft.pages=2054-2069&rft.issn=1755-098X&rft.eissn=1755-0998&rft_id=info:doi/10.1111/1755-0998.13591&rft_dat=%3Cproquest_pubme%3E2673565605%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2673565605&rft_id=info:pmid/35094502&rfr_iscdi=true