The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets

The software program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downwar...

Bibliographic details
Published in: Molecular ecology resources 2021-10
Main authors: Stankiewicz, Kathryn, Vasquez Kuntz, Kate, Ledoux, Jean-Baptiste, Aurelle, D., Garrabou, Joaquim, Nakajima, Yuichi, Dahl, Mikael, Zayasu, Yuna, Jaziri, Sabri, Costantini, Federica, Baums, Iliana
Format: Article
Language: eng
Subjects: Biodiversity and Ecology; Environmental Sciences
Online access: Full text
container_title Molecular ecology resources
creator Stankiewicz, Kathryn
Vasquez Kuntz, Kate
Ledoux, Jean-Baptiste
Aurelle, D.
Garrabou, Joaquim
Nakajima, Yuichi
Dahl, Mikael
Zayasu, Yuna
Jaziri, Sabri
Costantini, Federica
Baums, Iliana
description The software program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downward bias in its estimation of K and is sensitive to uneven sampling. If this finding holds for empirical datasets, conclusions about the scale of gene flow may have to be revised for a large number of studies. To determine the impact of method choice, we applied recently described estimators of K to re-estimate genetic structure in 41 empirical microsatellite datasets; 15 from a broad range of taxa and 26 focused on a diverse phylogenetic group, coral. We compared alternative estimates of K (Puechmaille statistics) with traditional (ΔK and posterior probability) estimates and found widespread disagreement of estimators across datasets. Thus, one estimator alone is insufficient for determining the optimal number of clusters regardless of study organism or evenness of sampling scheme. Subsequent analysis of molecular variance (AMOVA) between clustering solutions did not necessarily clarify which solution was best. To better infer population structure, we suggest a combination of visual inspection of STRUCTURE plots and calculation of the alternative estimators at various thresholds in addition to ΔK. Differences between estimators could reveal patterns with important biological implications, such as the potential for more population structure than previously estimated, as was the case for many studies reanalyzed here.
doi_str_mv 10.1111/1755-0998.13522
format Article
fullrecord <record><control><sourceid>hal</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_03372665v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_HAL_hal_03372665v1</sourcerecordid><originalsourceid>FETCH-hal_primary_oai_HAL_hal_03372665v13</originalsourceid><addsrcrecordid>eNqVTrlOw0AQXSGQCEdNOy1Fgg9sx3ScigRlCjprtBnbg9a71s4a5D_j87ARhJpp3ujpXUpdxNEqnu4qLrJsGZXlehWnWZIcqMWeOdz_69djdSLyFkV5VBbXC_W5bQm461EHcDWQBO4wOA-6dazpBh5YsPFEHdkAbEGbQQJ5tg2IM0NgZwVQeycCz39-gXoKucORhNECWjSjsMwdvesHg7MRGrIUWIMEP-gwePpNQvjgHYFH29D3rq5nzxoN7DCgUJAzdVSjETr_wVN1-fS4vd8sWzRV76cVfqwccrW5falmLkrTIsnz7D1O_6P9AoTIcOM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets</title><source>Wiley Online Library All Journals</source><creator>Stankiewicz, Kathryn ; Vasquez Kuntz, Kate ; Ledoux, Jean-Baptiste ; Aurelle, D. ; Garrabou, Joaquim ; Nakajima, Yuichi ; Dahl, Mikael ; Zayasu, Yuna ; Jaziri, Sabri ; Costantini, Federica ; Baums, Iliana</creator><creatorcontrib>Stankiewicz, Kathryn ; Vasquez Kuntz, Kate ; Ledoux, Jean-Baptiste ; Aurelle, D. ; Garrabou, Joaquim ; Nakajima, Yuichi ; Dahl, Mikael ; Zayasu, Yuna ; Jaziri, Sabri ; Costantini, Federica ; Baums, Iliana</creatorcontrib><description>The software program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downward bias in its estimation of K and is sensitive to uneven sampling. 
If this finding holds for empirical datasets, conclusions about the scale of gene flow may have to be revised for a large number of studies. To determine the impact of method choice, we applied recently described estimators of K to re-estimate genetic structure in 41 empirical microsatellite datasets; 15 from a broad range of taxa and 26 focused on a diverse phylogenetic group, coral. We compared alternative estimates of K (Puechmaille statistics) with traditional (ΔK and posterior probability) estimates and found widespread disagreement of estimators across datasets. Thus, one estimator alone is insufficient for determining the optimal number of clusters regardless of study organism or evenness of sampling scheme. Subsequent analysis of molecular variance (AMOVA) between clustering solutions did not necessarily clarify which solution was best. To better infer population structure, we suggest a combination of visual inspection of STRUCTURE plots and calculation of the alternative estimators at various thresholds in addition to ΔK. 
Differences between estimators could reveal patterns with important biological implications, such as the potential for more population structure than previously estimated, as was the case for many studies reanalyzed here.</description><identifier>ISSN: 1755-098X</identifier><identifier>EISSN: 1755-0998</identifier><identifier>DOI: 10.1111/1755-0998.13522</identifier><language>eng</language><publisher>Wiley/Blackwell</publisher><subject>Biodiversity and Ecology ; Environmental Sciences</subject><ispartof>Molecular ecology resources, 2021-10</ispartof><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-3922-7291 ; 0000-0001-6463-7308 ; 0000-0001-6463-7308 ; 0000-0002-3922-7291</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27923,27924</link.rule.ids><backlink>$$Uhttps://hal.science/hal-03372665$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Stankiewicz, Kathryn</creatorcontrib><creatorcontrib>Vasquez Kuntz, Kate</creatorcontrib><creatorcontrib>Ledoux, Jean-Baptiste</creatorcontrib><creatorcontrib>Aurelle, D.</creatorcontrib><creatorcontrib>Garrabou, Joaquim</creatorcontrib><creatorcontrib>Nakajima, Yuichi</creatorcontrib><creatorcontrib>Dahl, Mikael</creatorcontrib><creatorcontrib>Zayasu, Yuna</creatorcontrib><creatorcontrib>Jaziri, Sabri</creatorcontrib><creatorcontrib>Costantini, Federica</creatorcontrib><creatorcontrib>Baums, Iliana</creatorcontrib><title>The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets</title><title>Molecular ecology resources</title><description>The software 
program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downward bias in its estimation of K and is sensitive to uneven sampling. If this finding holds for empirical datasets, conclusions about the scale of gene flow may have to be revised for a large number of studies. To determine the impact of method choice, we applied recently described estimators of K to re-estimate genetic structure in 41 empirical microsatellite datasets; 15 from a broad range of taxa and 26 focused on a diverse phylogenetic group, coral. We compared alternative estimates of K (Puechmaille statistics) with traditional (ΔK and posterior probability) estimates and found widespread disagreement of estimators across datasets. Thus, one estimator alone is insufficient for determining the optimal number of clusters regardless of study organism or evenness of sampling scheme. Subsequent analysis of molecular variance (AMOVA) between clustering solutions did not necessarily clarify which solution was best. To better infer population structure, we suggest a combination of visual inspection of STRUCTURE plots and calculation of the alternative estimators at various thresholds in addition to ΔK. 
Differences between estimators could reveal patterns with important biological implications, such as the potential for more population structure than previously estimated, as was the case for many studies reanalyzed here.</description><subject>Biodiversity and Ecology</subject><subject>Environmental Sciences</subject><issn>1755-098X</issn><issn>1755-0998</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNqVTrlOw0AQXSGQCEdNOy1Fgg9sx3ScigRlCjprtBnbg9a71s4a5D_j87ARhJpp3ujpXUpdxNEqnu4qLrJsGZXlehWnWZIcqMWeOdz_69djdSLyFkV5VBbXC_W5bQm461EHcDWQBO4wOA-6dazpBh5YsPFEHdkAbEGbQQJ5tg2IM0NgZwVQeycCz39-gXoKucORhNECWjSjsMwdvesHg7MRGrIUWIMEP-gwePpNQvjgHYFH29D3rq5nzxoN7DCgUJAzdVSjETr_wVN1-fS4vd8sWzRV76cVfqwccrW5falmLkrTIsnz7D1O_6P9AoTIcOM</recordid><startdate>20211001</startdate><enddate>20211001</enddate><creator>Stankiewicz, Kathryn</creator><creator>Vasquez Kuntz, Kate</creator><creator>Ledoux, Jean-Baptiste</creator><creator>Aurelle, D.</creator><creator>Garrabou, Joaquim</creator><creator>Nakajima, Yuichi</creator><creator>Dahl, Mikael</creator><creator>Zayasu, Yuna</creator><creator>Jaziri, Sabri</creator><creator>Costantini, Federica</creator><creator>Baums, Iliana</creator><general>Wiley/Blackwell</general><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0002-3922-7291</orcidid><orcidid>https://orcid.org/0000-0001-6463-7308</orcidid><orcidid>https://orcid.org/0000-0001-6463-7308</orcidid><orcidid>https://orcid.org/0000-0002-3922-7291</orcidid></search><sort><creationdate>20211001</creationdate><title>The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets</title><author>Stankiewicz, Kathryn ; Vasquez Kuntz, Kate ; Ledoux, Jean-Baptiste ; Aurelle, D. 
; Garrabou, Joaquim ; Nakajima, Yuichi ; Dahl, Mikael ; Zayasu, Yuna ; Jaziri, Sabri ; Costantini, Federica ; Baums, Iliana</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-hal_primary_oai_HAL_hal_03372665v13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Biodiversity and Ecology</topic><topic>Environmental Sciences</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Stankiewicz, Kathryn</creatorcontrib><creatorcontrib>Vasquez Kuntz, Kate</creatorcontrib><creatorcontrib>Ledoux, Jean-Baptiste</creatorcontrib><creatorcontrib>Aurelle, D.</creatorcontrib><creatorcontrib>Garrabou, Joaquim</creatorcontrib><creatorcontrib>Nakajima, Yuichi</creatorcontrib><creatorcontrib>Dahl, Mikael</creatorcontrib><creatorcontrib>Zayasu, Yuna</creatorcontrib><creatorcontrib>Jaziri, Sabri</creatorcontrib><creatorcontrib>Costantini, Federica</creatorcontrib><creatorcontrib>Baums, Iliana</creatorcontrib><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>Molecular ecology resources</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Stankiewicz, Kathryn</au><au>Vasquez Kuntz, Kate</au><au>Ledoux, Jean-Baptiste</au><au>Aurelle, D.</au><au>Garrabou, Joaquim</au><au>Nakajima, Yuichi</au><au>Dahl, Mikael</au><au>Zayasu, Yuna</au><au>Jaziri, Sabri</au><au>Costantini, Federica</au><au>Baums, Iliana</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets</atitle><jtitle>Molecular ecology 
resources</jtitle><date>2021-10-01</date><risdate>2021</risdate><issn>1755-098X</issn><eissn>1755-0998</eissn><abstract>The software program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downward bias in its estimation of K and is sensitive to uneven sampling. If this finding holds for empirical datasets, conclusions about the scale of gene flow may have to be revised for a large number of studies. To determine the impact of method choice, we applied recently described estimators of K to re-estimate genetic structure in 41 empirical microsatellite datasets; 15 from a broad range of taxa and 26 focused on a diverse phylogenetic group, coral. We compared alternative estimates of K (Puechmaille statistics) with traditional (ΔK and posterior probability) estimates and found widespread disagreement of estimators across datasets. Thus, one estimator alone is insufficient for determining the optimal number of clusters regardless of study organism or evenness of sampling scheme. Subsequent analysis of molecular variance (AMOVA) between clustering solutions did not necessarily clarify which solution was best. To better infer population structure, we suggest a combination of visual inspection of STRUCTURE plots and calculation of the alternative estimators at various thresholds in addition to ΔK. 
Differences between estimators could reveal patterns with important biological implications, such as the potential for more population structure than previously estimated, as was the case for many studies reanalyzed here.</abstract><pub>Wiley/Blackwell</pub><doi>10.1111/1755-0998.13522</doi><orcidid>https://orcid.org/0000-0002-3922-7291</orcidid><orcidid>https://orcid.org/0000-0001-6463-7308</orcidid><orcidid>https://orcid.org/0000-0001-6463-7308</orcidid><orcidid>https://orcid.org/0000-0002-3922-7291</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1755-098X
ispartof Molecular ecology resources, 2021-10
issn 1755-098X
1755-0998
language eng
recordid cdi_hal_primary_oai_HAL_hal_03372665v1
source Wiley Online Library All Journals
subjects Biodiversity and Ecology
Environmental Sciences
title The impact of estimator choice: Disagreement in clustering solutions across K estimators for Bayesian analysis of population genetic structure across a wide range of empirical datasets
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T00%3A19%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-hal&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20impact%20of%20estimator%20choice:%20Disagreement%20in%20clustering%20solutions%20across%20K%20estimators%20for%20Bayesian%20analysis%20of%20population%20genetic%20structure%20across%20a%20wide%20range%20of%20empirical%20datasets&rft.jtitle=Molecular%20ecology%20resources&rft.au=Stankiewicz,%20Kathryn&rft.date=2021-10-01&rft.issn=1755-098X&rft.eissn=1755-0998&rft_id=info:doi/10.1111/1755-0998.13522&rft_dat=%3Chal%3Eoai_HAL_hal_03372665v1%3C/hal%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true