Deformity Index: A Semi-Reference Clade-Based Quality Metric of Phylogenetic Trees

Measuring the dissimilarity of a phylogenetic tree with respect to a reference tree or the hypotheses is a fundamental task in the phylogenetic study. A large number of methods have been proposed to compute the distance between the reference tree and the target tree. Due to the presence of unresolve...

Detailed description

Bibliographic details
Published in: Journal of molecular evolution 2021-06, Vol.89 (4-5), p.302-312
Main authors: Mahapatra, Aritra; Mukherjee, Jayanta
Format: Article
Language: eng
Online access: Full text
container_end_page 312
container_issue 4-5
container_start_page 302
container_title Journal of molecular evolution
container_volume 89
creator Mahapatra, Aritra
Mukherjee, Jayanta
description Measuring the dissimilarity of a phylogenetic tree with respect to a reference tree or the hypotheses is a fundamental task in the phylogenetic study. A large number of methods have been proposed to compute the distance between the reference tree and the target tree. Due to the presence of unresolved relationships among the species, it is challenging to obtain a precise and accurate reference tree for a selected dataset. As a result, the existing tree comparison methods may behave unexpectedly in various scenarios. In this paper, we introduce a novel scoring function, called the deformity index, to quantify the dissimilarity of a tree based on the list of clades of a reference tree. The strength of our proposed method is that it depends on the list of clades that can be acquired either from the reference tree or from the hypotheses. We investigate the distributions of different modules of the deformity index and perform different goodness-of-fit tests to understand the cumulative distribution. Then, we examine, in detail, the robustness as well as the scalability of our measure by performing different statistical tests under various models. Finally, we experiment on different biological datasets and show that our proposed scoring function overcomes the limitations of the conventional methods.
doi_str_mv 10.1007/s00239-021-10006-4
format Article
orcid https://orcid.org/0000-0003-0078-8590
pmid 33811501
publisher New York: Springer US
rights The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
fulltext fulltext
identifier ISSN: 0022-2844
ispartof Journal of molecular evolution, 2021-06, Vol.89 (4-5), p.302-312
issn 0022-2844
1432-1432
language eng
recordid cdi_proquest_miscellaneous_2508578750
source SpringerLink Journals - AutoHoldings
subjects Animal Genetics and Genomics
Biomedical and Life Sciences
Cell Biology
Datasets
Evolutionary Biology
Goodness of fit
Hypotheses
Life Sciences
Mathematical models
Microbiology
Original Article
Phylogenetics
Phylogeny
Plant Genetics and Genomics
Plant Sciences
Statistical analysis
Statistical tests
title Deformity Index: A Semi-Reference Clade-Based Quality Metric of Phylogenetic Trees
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T20%3A21%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deformity%20Index:%20A%20Semi-Reference%20Clade-Based%20Quality%20Metric%20of%20Phylogenetic%20Trees&rft.jtitle=Journal%20of%20molecular%20evolution&rft.au=Mahapatra,%20Aritra&rft.date=2021-06-01&rft.volume=89&rft.issue=4-5&rft.spage=302&rft.epage=312&rft.pages=302-312&rft.issn=0022-2844&rft.eissn=1432-1432&rft_id=info:doi/10.1007/s00239-021-10006-4&rft_dat=%3Cproquest_cross%3E2508578750%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2525901490&rft_id=info:pmid/33811501&rfr_iscdi=true