Computational Reproducibility of Molecular Phylogenies
Abstract Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparat...
Published in: | Molecular biology and evolution 2023-07, Vol.40 (7) |
---|---|
Main authors: | Kumar, Sudhir; Tao, Qiqing; Lamarca, Alessandra P; Tamura, Koichiro |
Format: | Article |
Language: | eng |
Keywords: | Genomics; Letter; Phylogeny |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 7 |
container_start_page | |
container_title | Molecular biology and evolution |
container_volume | 40 |
creator | Kumar, Sudhir; Tao, Qiqing; Lamarca, Alessandra P; Tamura, Koichiro |
description | Abstract
Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics. |
doi_str_mv | 10.1093/molbev/msad165 |
format | Article |
fullrecord | recordid: TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10370456; sourcesystem: PC; sourcetype: Open Access Repository; galeid: A775149589; pqid: 2840245339; contributor: Crandall, Keith; identifier: ISSN 0737-4038, EISSN 1537-1719, DOI 10.1093/molbev/msad165, PMID 37467477; publisher: US: Oxford University Press; rights: The Author(s) 2023. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. COPYRIGHT 2023 Oxford University Press; peer_reviewed; oa: free_for_read; orcid: https://orcid.org/0000-0002-9918-8212; date: 2023-07-05; addtitle: Mol Biol Evol; collection: Oxford Journals Open Access Collection, PubMed, CrossRef, MEDLINE - Academic, PubMed Central (Full Participant titles); linktohtml: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10370456/; linktopdf: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10370456/pdf/; backlink: https://www.ncbi.nlm.nih.gov/pubmed/37467477 |
fulltext | fulltext |
identifier | ISSN: 0737-4038 |
ispartof | Molecular biology and evolution, 2023-07, Vol.40 (7) |
issn | 0737-4038; 1537-1719 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10370456 |
source | Oxford Journals Open Access Collection; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry |
subjects | Genomics; Letter; Phylogeny |
title | Computational Reproducibility of Molecular Phylogenies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T13%3A59%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Computational%20Reproducibility%20of%20Molecular%20Phylogenies&rft.jtitle=Molecular%20biology%20and%20evolution&rft.au=Kumar,%20Sudhir&rft.date=2023-07-05&rft.volume=40&rft.issue=7&rft.issn=0737-4038&rft.eissn=1537-1719&rft_id=info:doi/10.1093/molbev/msad165&rft_dat=%3Cgale_pubme%3EA775149589%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2840245339&rft_id=info:pmid/37467477&rft_galeid=A775149589&rft_oup_id=10.1093/molbev/msad165&rfr_iscdi=true |
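
The `url` row is an OpenURL (Z39.88-2004) link whose query string repeats the citation as `rft.*` key/value pairs. As a minimal sketch, assuming only Python's standard library, those pairs can be decoded back into a citation dictionary. The function name `decode_openurl` and the shortened `OPENURL` constant are illustrative and not part of the catalog record; the constant keeps only a subset of the parameters from the full link.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the resolver link from the `url` row (illustrative subset of keys).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Computational%20Reproducibility%20of%20Molecular%20Phylogenies"
    "&rft.jtitle=Molecular%20biology%20and%20evolution"
    "&rft.au=Kumar,%20Sudhir"
    "&rft.date=2023-07-05&rft.volume=40&rft.issue=7"
    "&rft.issn=0737-4038&rft.eissn=1537-1719"
    "&rft_id=info:doi/10.1093/molbev/msad165"
    "&rft_id=info:pmid/37467477"
)


def decode_openurl(url):
    """Return the citation fields carried in an OpenURL query string."""
    # parse_qs percent-decodes values and collects repeated keys into lists.
    query = parse_qs(urlsplit(url).query)

    # Referent metadata: keys of the form "rft.<field>"; keep the first value.
    citation = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }

    # Identifiers: repeated "rft_id=info:<scheme>/<value>" entries.
    for ident in query.get("rft_id", []):
        if ident.startswith("info:"):
            ident = ident[len("info:"):]
        scheme, _, value = ident.partition("/")
        if value:  # skip empty identifiers such as "info:oai/"
            citation[scheme] = value
    return citation


if __name__ == "__main__":
    for field, value in sorted(decode_openurl(OPENURL).items()):
        print(f"{field}: {value}")
```

Splitting on the first `/` only keeps a DOI such as `10.1093/molbev/msad165` intact, and empty identifiers like the `info:oai/` entry in the full link are skipped.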