Identifying localized biases in large datasets: A case study using the avian tree of life
• We corroborate many higher-level avian relationships from Hackett et al. (2008).
• We compared total evidence (50-gene) analyses to independent analyses of published (19-gene) and novel (31-gene) data.
• Independent analyses highlighted a small number of conflicts.
• A single locus (FGB...
Saved in:
Published in: | Molecular phylogenetics and evolution 2013-12, Vol.69 (3), p.1021-1032 |
---|---|
Main authors: | Kimball, Rebecca T.; Wang, Ning; Heimer-McGinn, Victoria; Ferguson, Carly; Braun, Edward L. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1032 |
---|---|
container_issue | 3 |
container_start_page | 1021 |
container_title | Molecular phylogenetics and evolution |
container_volume | 69 |
creator | Kimball, Rebecca T.; Wang, Ning; Heimer-McGinn, Victoria; Ferguson, Carly; Braun, Edward L. |
description | • We corroborate many higher-level avian relationships from Hackett et al. (2008).
• We compared total evidence (50-gene) analyses to independent analyses of published (19-gene) and novel (31-gene) data.
• Independent analyses highlighted a small number of conflicts.
• A single locus (FGB) had a strong influence on the total evidence analysis.
• A different locus (BDNF) had an effect on the species tree estimated from gene trees.
Large-scale multi-locus studies have become common in molecular phylogenetics, with new studies continually adding to previous datasets in an effort to fully resolve the tree of life. Total evidence analyses that combine existing data with newly collected data are expected to increase the power of phylogenetic analyses to resolve difficult relationships. However, they might be subject to localized biases, with one or a few loci having a strong and potentially misleading influence upon the results. To examine this possibility we combined a newly collected 31-locus dataset that includes representatives of all major avian lineages with a published dataset of 19 loci that has a comparable number of sites (Hackett et al., 2008. Science 320, 1763–1768). This allowed us to explore the advantages of conducting total evidence analyses, and to determine whether it was also important to analyze new datasets independent of published ones. The total evidence analysis yielded results very similar to the published results, with only slightly increased support at a few nodes. However, analyzing the 31- and 19-locus datasets separately highlighted several differences. Two clades received strong support in the published dataset and total evidence analysis, but the support appeared to reflect bias at a single locus (β-fibrinogen [FGB]). The signal in FGB that supported these relationships was sufficient to result in their recovery with bootstrap support, even when combined with 49 loci lacking that signal. FGB did not appear to have a substantial impact upon the results of species tree methods, but another locus (brain-derived neurotrophic factor [BDNF]) did have an impact upon those analyses. These results demonstrated that localized biases can influence large-scale phylogenetic analyses but they also indicated that considering independent evidence and exploring multiple analytical approaches could reveal them. |
doi_str_mv | 10.1016/j.ympev.2013.05.029 |
format | Article |
pmid | 23791948 |
publisher | United States: Elsevier Inc |
rights | Copyright © 2013 Elsevier Inc. All rights reserved. |
addtitle | Mol Phylogenet Evol |
fulltext | fulltext |
identifier | ISSN: 1055-7903 |
ispartof | Molecular phylogenetics and evolution, 2013-12, Vol.69 (3), p.1021-1032 |
issn | 1055-7903 (ISSN); 1095-9513 (EISSN) |
language | eng |
recordid | cdi_proquest_miscellaneous_1513485065 |
source | MEDLINE; Elsevier ScienceDirect Journals |
subjects | Animals; Bias; Biological Evolution; birds; Birds - classification; Birds - genetics; case studies; Cladding; Conduction; data collection; Evolution; Gene tree discordance; Impact analysis; Incongruence; Likelihood Functions; Localized biases; Loci; Mathematical analysis; Models, Genetic; Phylogenomics; Phylogeny; Sequence Alignment; Sequence Analysis, DNA; Trees |
title | Identifying localized biases in large datasets: A case study using the avian tree of life |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T06%3A38%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20localized%20biases%20in%20large%20datasets:%20A%20case%20study%20using%20the%20avian%20tree%20of%20life&rft.jtitle=Molecular%20phylogenetics%20and%20evolution&rft.au=Kimball,%20Rebecca%20T.&rft.date=2013-12-01&rft.volume=69&rft.issue=3&rft.spage=1021&rft.epage=1032&rft.pages=1021-1032&rft.issn=1055-7903&rft.eissn=1095-9513&rft_id=info:doi/10.1016/j.ympev.2013.05.029&rft_dat=%3Cproquest_cross%3E1513485065%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1438575335&rft_id=info:pmid/23791948&rft_els_id=S1055790313002431&rfr_iscdi=true |