Computational Reproducibility of Molecular Phylogenies

Abstract: Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparat...

Detailed description

Bibliographic details
Published in: Molecular biology and evolution 2023-07, Vol.40 (7)
Main authors: Kumar, Sudhir; Tao, Qiqing; Lamarca, Alessandra P; Tamura, Koichiro
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 7
container_start_page
container_title Molecular biology and evolution
container_volume 40
creator Kumar, Sudhir; Tao, Qiqing; Lamarca, Alessandra P; Tamura, Koichiro
description Abstract Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.
doi_str_mv 10.1093/molbev/msad165
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10370456</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A775149589</galeid><oup_id>10.1093/molbev/msad165</oup_id><sourcerecordid>A775149589</sourcerecordid><originalsourceid>FETCH-LOGICAL-c419t-1cfa75300ebd5ddf10783c81f5a73db7d6e18cc5e3286c87664816d0748203f3</originalsourceid><addsrcrecordid>eNqFkctLAzEQh4MoWh9Xj9KjHqpJ89yTSPEFiiK9h2wyqZHsZt3sFvrfu9Ja9CQ5TMh88zHhh9ApwZcEF_SqSrGE5VWVjSOC76AR4VROiCTFLhphOdwZpuoAHeb8gTFhTIh9dEAlE5JJOUJilqqm70wXUm3i-A2aNrnehjLE0K3GyY-fUwTbR9OOX99XMS2gDpCP0Z43McPJph6h-d3tfPYweXq5f5zdPE0sI0U3IdYbySnGUDrunCdYKmoV8dxI6krpBBBlLQc6VcIqKQRTRDgsmZpi6ukRul5rm76swFmou9ZE3bShMu1KJxP0304d3vUiLTXBVGLGxWA43xja9NlD7nQVsoUYTQ2pz3qqGJ4yTmkxoJdrdGEi6FD7NCjtcBxUwaYafBjeb6TkhBVc_Rqwbcq5Bb9djGD9nY5ep6M36QwDZ7-_s8V_4hiAizWQ-uY_2Rcr7pyj</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2840245339</pqid></control><display><type>article</type><title>Computational Reproducibility of Molecular Phylogenies</title><source>Oxford Journals Open Access Collection</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Kumar, Sudhir ; Tao, Qiqing ; Lamarca, Alessandra P ; Tamura, Koichiro</creator><contributor>Crandall, Keith</contributor><creatorcontrib>Kumar, Sudhir ; Tao, Qiqing ; Lamarca, Alessandra P ; Tamura, Koichiro ; Crandall, Keith</creatorcontrib><description>Abstract Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. 
This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.</description><identifier>ISSN: 0737-4038</identifier><identifier>EISSN: 1537-1719</identifier><identifier>DOI: 10.1093/molbev/msad165</identifier><identifier>PMID: 37467477</identifier><language>eng</language><publisher>US: Oxford University Press</publisher><subject>Genomics ; Letter ; Phylogeny</subject><ispartof>Molecular biology and evolution, 2023-07, Vol.40 (7)</ispartof><rights>The Author(s) 2023. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. 2023</rights><rights>The Author(s) 2023. 
Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.</rights><rights>COPYRIGHT 2023 Oxford University Press</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c419t-1cfa75300ebd5ddf10783c81f5a73db7d6e18cc5e3286c87664816d0748203f3</cites><orcidid>0000-0002-9918-8212</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10370456/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10370456/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37467477$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Crandall, Keith</contributor><creatorcontrib>Kumar, Sudhir</creatorcontrib><creatorcontrib>Tao, Qiqing</creatorcontrib><creatorcontrib>Lamarca, Alessandra P</creatorcontrib><creatorcontrib>Tamura, Koichiro</creatorcontrib><title>Computational Reproducibility of Molecular Phylogenies</title><title>Molecular biology and evolution</title><addtitle>Mol Biol Evol</addtitle><description>Abstract Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. 
Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.</description><subject>Genomics</subject><subject>Letter</subject><subject>Phylogeny</subject><issn>0737-4038</issn><issn>1537-1719</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqFkctLAzEQh4MoWh9Xj9KjHqpJ89yTSPEFiiK9h2wyqZHsZt3sFvrfu9Ja9CQ5TMh88zHhh9ApwZcEF_SqSrGE5VWVjSOC76AR4VROiCTFLhphOdwZpuoAHeb8gTFhTIh9dEAlE5JJOUJilqqm70wXUm3i-A2aNrnehjLE0K3GyY-fUwTbR9OOX99XMS2gDpCP0Z43McPJph6h-d3tfPYweXq5f5zdPE0sI0U3IdYbySnGUDrunCdYKmoV8dxI6krpBBBlLQc6VcIqKQRTRDgsmZpi6ukRul5rm76swFmou9ZE3bShMu1KJxP0304d3vUiLTXBVGLGxWA43xja9NlD7nQVsoUYTQ2pz3qqGJ4yTmkxoJdrdGEi6FD7NCjtcBxUwaYafBjeb6TkhBVc_Rqwbcq5Bb9djGD9nY5ep6M36QwDZ7-_s8V_4hiAizWQ-uY_2Rcr7pyj</recordid><startdate>20230705</startdate><enddate>20230705</enddate><creator>Kumar, Sudhir</creator><creator>Tao, Qiqing</creator><creator>Lamarca, Alessandra P</creator><creator>Tamura, Koichiro</creator><general>Oxford University Press</general><scope>TOX</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-9918-8212</orcidid></search><sort><creationdate>20230705</creationdate><title>Computational Reproducibility of Molecular Phylogenies</title><author>Kumar, Sudhir ; Tao, Qiqing ; Lamarca, 
Alessandra P ; Tamura, Koichiro</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c419t-1cfa75300ebd5ddf10783c81f5a73db7d6e18cc5e3286c87664816d0748203f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Genomics</topic><topic>Letter</topic><topic>Phylogeny</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kumar, Sudhir</creatorcontrib><creatorcontrib>Tao, Qiqing</creatorcontrib><creatorcontrib>Lamarca, Alessandra P</creatorcontrib><creatorcontrib>Tamura, Koichiro</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Molecular biology and evolution</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kumar, Sudhir</au><au>Tao, Qiqing</au><au>Lamarca, Alessandra P</au><au>Tamura, Koichiro</au><au>Crandall, Keith</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Computational Reproducibility of Molecular Phylogenies</atitle><jtitle>Molecular biology and evolution</jtitle><addtitle>Mol Biol Evol</addtitle><date>2023-07-05</date><risdate>2023</risdate><volume>40</volume><issue>7</issue><issn>0737-4038</issn><eissn>1537-1719</eissn><abstract>Abstract Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. 
We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.</abstract><cop>US</cop><pub>Oxford University Press</pub><pmid>37467477</pmid><doi>10.1093/molbev/msad165</doi><orcidid>https://orcid.org/0000-0002-9918-8212</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0737-4038
ispartof Molecular biology and evolution, 2023-07, Vol.40 (7)
issn 0737-4038; 1537-1719
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10370456
source Oxford Journals Open Access Collection; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Genomics; Letter; Phylogeny
title Computational Reproducibility of Molecular Phylogenies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T13%3A59%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Computational%20Reproducibility%20of%20Molecular%20Phylogenies&rft.jtitle=Molecular%20biology%20and%20evolution&rft.au=Kumar,%20Sudhir&rft.date=2023-07-05&rft.volume=40&rft.issue=7&rft.issn=0737-4038&rft.eissn=1537-1719&rft_id=info:doi/10.1093/molbev/msad165&rft_dat=%3Cgale_pubme%3EA775149589%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2840245339&rft_id=info:pmid/37467477&rft_galeid=A775149589&rft_oup_id=10.1093/molbev/msad165&rfr_iscdi=true