PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events
Identifying homology relationships is essential for inferring gene functions, reconstructing the phylogeny of gene families, and uncovering the evolutionary history of life, and it is usually the first step of many genetic and genomic studies. However, the presence of gene duplicates, variation in evolutionary...
Published in: | Methods in ecology and evolution 2020-08, Vol.11 (8), p.943-954 |
---|---|
Main authors: | Zhou, Shengyu; Chen, Yamao; Guo, Chunce; Qi, Ji; Johnston, Susan |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 954 |
---|---|
container_issue | 8 |
container_start_page | 943 |
container_title | Methods in ecology and evolution |
container_volume | 11 |
creator | Zhou, Shengyu; Chen, Yamao; Guo, Chunce; Qi, Ji; Johnston, Susan |
description | Identifying homology relationships is essential for inferring gene functions, reconstructing the phylogeny of gene families, and uncovering the evolutionary history of life, and it is usually the first step of many genetic and genomic studies. However, gene duplicates, variation in the evolutionary rates of homologs, and the fusion and fission of genes can lead to misidentification of evolutionary relationships among homologs.
Here we provide a Markov clustering-based method called PhyloMCL to accurately detect hierarchical orthogroups (HOGs), including orthologs and paralogs derived from duplications subsequent to the speciation of the species involved, by considering both the phylogenetic relationships of organisms and the effects of polyploidy events.
Its performance, evaluated on a list of benchmark gene families when clustering HOGs from 12 metazoan genomes, reaches up to 87.8% recall and 83.2% precision. Applied further to the classification of tens of thousands of paralogs produced by multiple polyploidy events during the evolution of seed plants, PhyloMCL successfully identifies the majority of in‐/out‐paralogs at different taxonomic levels.
Benefiting from the Markov clustering strategy and the guidance of a species tree, PhyloMCL can accurately classify millions of homologous genes in an affordable time, meeting the challenge that the rapidly increasing number of sequenced genomes poses for phylogenomic studies. |
doi_str_mv | 10.1111/2041-210X.13401 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2429612933</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2429612933</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3571-7abaa8494fdecb2d9656dbc2dad420bb1558067f224bc5f21a50e8a3dcdfbe283</originalsourceid><addsrcrecordid>eNqFkEtLw0AUhYMoWGrXbgdcp52ZPJq4K6U-oEUXCu6GedxJpsRMnEmU7PzpJkbEnWdzL5dzzoUvCC4JXpJBK4pjElKCX5YkijE5CWa_l9M_-3mw8P6IB0VZjmk8Cz4fy76yh-3-Gm2k7BxvAcmq8y04UxfIalQacNzJ0kheIeva0hbOdo1HRWcUKCR61IwdBdTQGokcVLw1tvalaRCvFTK1Bge1hLGtsVXfVNaoHsE71K2_CM40rzwsfuY8eL7ZPW3vwv3D7f12sw9llKxJuOaC8yzOY61ACqryNEmVkFRxFVMsBEmSDKdrTWksZKIp4QmGjEdKKi2AZtE8uJp6G2ffOvAtO9rO1cNLRmOap4TmUTS4VpNLOuu9A80aZ1656xnBbCTNRpZsZMm-SQ-JdEp8mAr6_-zssNtFU_ALy_WDrQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2429612933</pqid></control><display><type>article</type><title>PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events</title><source>Wiley Online Library Journals Frontfile Complete</source><source>Alma/SFX Local Collection</source><creator>Zhou, Shengyu ; Chen, Yamao ; Guo, Chunce ; Qi, Ji ; Johnston, Susan</creator><creatorcontrib>Zhou, Shengyu ; Chen, Yamao ; Guo, Chunce ; Qi, Ji ; Johnston, Susan</creatorcontrib><description>Identification of homology relationships is essential for inferring gene functions, detecting phylogeny of gene families, discovering evolutionary history of life, and usually, is the first step of many genetic and genomic studies. However, the presence of gene duplicates, variation on evolutionary rates of homologs, fusion and fission of genes, can lead to misidentification of evolutionary relationships among homologs.
Here we provide a Markov clustering based method called PhyloMCL to accurately detect hierarchical orthogroups (HOGs) including orthologs and paralogs, which derived from duplications subsequent to speciation of involved species, by considering both phylogenetic relationship of organisms and effects of polyploidy events.
Its performance, evaluated by a list of benchmark gene families, when applying to the clustering of HOGs from 12 Metazoan genomes, reaches up to 87.8% and 83.2% on recall and precision rates respectively. Further application of PhyloMCL on classification of tens of thousands of paralogs, yielded by multiple polyploidy events during evolution of seed plants, successfully identifies the majority of in‐/out‐paralogs at different taxonomic levels.
Benefiting from the strategy of Markov clustering and guidance of species tree, PhyloMCL can accurately classify millions of homologous genes with affordable time, meeting the challenge of phylogenomic studies upon rapid increasing of sequenced genomes.</description><identifier>ISSN: 2041-210X</identifier><identifier>EISSN: 2041-210X</identifier><identifier>DOI: 10.1111/2041-210X.13401</identifier><language>eng</language><publisher>London: John Wiley & Sons, Inc</publisher><subject>Clustering ; Evolution ; Evolutionary genetics ; gene duplication ; Gene families ; Genes ; Genomes ; hierarchical orthogroup classification ; Homology ; Markov clustering ; paralogs ; Phylogenetics ; phylogenomics ; Phylogeny ; Polyploidy ; Reproduction (copying) ; Speciation ; Swine</subject><ispartof>Methods in ecology and evolution, 2020-08, Vol.11 (8), p.943-954</ispartof><rights>2020 British Ecological Society</rights><rights>Methods in Ecology and Evolution © 2020 British Ecological Society</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3571-7abaa8494fdecb2d9656dbc2dad420bb1558067f224bc5f21a50e8a3dcdfbe283</citedby><cites>FETCH-LOGICAL-c3571-7abaa8494fdecb2d9656dbc2dad420bb1558067f224bc5f21a50e8a3dcdfbe283</cites><orcidid>0000-0003-3376-1116 ; 0000-0001-8716-7716 ; 0000-0001-7135-0524 ; 0000-0002-9472-4936</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1111%2F2041-210X.13401$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1111%2F2041-210X.13401$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,776,780,1411,27901,27902,45550,45551</link.rule.ids></links><search><creatorcontrib>Zhou, Shengyu</creatorcontrib><creatorcontrib>Chen, 
Yamao</creatorcontrib><creatorcontrib>Guo, Chunce</creatorcontrib><creatorcontrib>Qi, Ji</creatorcontrib><creatorcontrib>Johnston, Susan</creatorcontrib><title>PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events</title><title>Methods in ecology and evolution</title><description>Identification of homology relationships is essential for inferring gene functions, detecting phylogeny of gene families, discovering evolutionary history of life, and usually, is the first step of many genetic and genomic studies. However, the presence of gene duplicates, variation on evolutionary rates of homologs, fusion and fission of genes, can lead to misidentification of evolutionary relationships among homologs.
Here we provide a Markov clustering based method called PhyloMCL to accurately detect hierarchical orthogroups (HOGs) including orthologs and paralogs, which derived from duplications subsequent to speciation of involved species, by considering both phylogenetic relationship of organisms and effects of polyploidy events.
Its performance, evaluated by a list of benchmark gene families, when applying to the clustering of HOGs from 12 Metazoan genomes, reaches up to 87.8% and 83.2% on recall and precision rates respectively. Further application of PhyloMCL on classification of tens of thousands of paralogs, yielded by multiple polyploidy events during evolution of seed plants, successfully identifies the majority of in‐/out‐paralogs at different taxonomic levels.
Benefiting from the strategy of Markov clustering and guidance of species tree, PhyloMCL can accurately classify millions of homologous genes with affordable time, meeting the challenge of phylogenomic studies upon rapid increasing of sequenced genomes.</description><subject>Clustering</subject><subject>Evolution</subject><subject>Evolutionary genetics</subject><subject>gene duplication</subject><subject>Gene families</subject><subject>Genes</subject><subject>Genomes</subject><subject>hierarchical orthogroup classification</subject><subject>Homology</subject><subject>Markov clustering</subject><subject>paralogs</subject><subject>Phylogenetics</subject><subject>phylogenomics</subject><subject>Phylogeny</subject><subject>Polyploidy</subject><subject>Reproduction (copying)</subject><subject>Speciation</subject><subject>Swine</subject><issn>2041-210X</issn><issn>2041-210X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqFkEtLw0AUhYMoWGrXbgdcp52ZPJq4K6U-oEUXCu6GedxJpsRMnEmU7PzpJkbEnWdzL5dzzoUvCC4JXpJBK4pjElKCX5YkijE5CWa_l9M_-3mw8P6IB0VZjmk8Cz4fy76yh-3-Gm2k7BxvAcmq8y04UxfIalQacNzJ0kheIeva0hbOdo1HRWcUKCR61IwdBdTQGokcVLw1tvalaRCvFTK1Bge1hLGtsVXfVNaoHsE71K2_CM40rzwsfuY8eL7ZPW3vwv3D7f12sw9llKxJuOaC8yzOY61ACqryNEmVkFRxFVMsBEmSDKdrTWksZKIp4QmGjEdKKi2AZtE8uJp6G2ffOvAtO9rO1cNLRmOap4TmUTS4VpNLOuu9A80aZ1656xnBbCTNRpZsZMm-SQ-JdEp8mAr6_-zssNtFU_ALy_WDrQ</recordid><startdate>202008</startdate><enddate>202008</enddate><creator>Zhou, Shengyu</creator><creator>Chen, Yamao</creator><creator>Guo, Chunce</creator><creator>Qi, Ji</creator><creator>Johnston, Susan</creator><general>John Wiley & Sons, 
Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7QG</scope><scope>7SN</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><orcidid>https://orcid.org/0000-0003-3376-1116</orcidid><orcidid>https://orcid.org/0000-0001-8716-7716</orcidid><orcidid>https://orcid.org/0000-0001-7135-0524</orcidid><orcidid>https://orcid.org/0000-0002-9472-4936</orcidid></search><sort><creationdate>202008</creationdate><title>PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events</title><author>Zhou, Shengyu ; Chen, Yamao ; Guo, Chunce ; Qi, Ji ; Johnston, Susan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3571-7abaa8494fdecb2d9656dbc2dad420bb1558067f224bc5f21a50e8a3dcdfbe283</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Clustering</topic><topic>Evolution</topic><topic>Evolutionary genetics</topic><topic>gene duplication</topic><topic>Gene families</topic><topic>Genes</topic><topic>Genomes</topic><topic>hierarchical orthogroup classification</topic><topic>Homology</topic><topic>Markov clustering</topic><topic>paralogs</topic><topic>Phylogenetics</topic><topic>phylogenomics</topic><topic>Phylogeny</topic><topic>Polyploidy</topic><topic>Reproduction (copying)</topic><topic>Speciation</topic><topic>Swine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Shengyu</creatorcontrib><creatorcontrib>Chen, Yamao</creatorcontrib><creatorcontrib>Guo, Chunce</creatorcontrib><creatorcontrib>Qi, Ji</creatorcontrib><creatorcontrib>Johnston, Susan</creatorcontrib><collection>CrossRef</collection><collection>Animal Behavior Abstracts</collection><collection>Ecology Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution 
Management</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><jtitle>Methods in ecology and evolution</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Shengyu</au><au>Chen, Yamao</au><au>Guo, Chunce</au><au>Qi, Ji</au><au>Johnston, Susan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events</atitle><jtitle>Methods in ecology and evolution</jtitle><date>2020-08</date><risdate>2020</risdate><volume>11</volume><issue>8</issue><spage>943</spage><epage>954</epage><pages>943-954</pages><issn>2041-210X</issn><eissn>2041-210X</eissn><abstract>Identification of homology relationships is essential for inferring gene functions, detecting phylogeny of gene families, discovering evolutionary history of life, and usually, is the first step of many genetic and genomic studies. However, the presence of gene duplicates, variation on evolutionary rates of homologs, fusion and fission of genes, can lead to misidentification of evolutionary relationships among homologs.
Here we provide a Markov clustering based method called PhyloMCL to accurately detect hierarchical orthogroups (HOGs) including orthologs and paralogs, which derived from duplications subsequent to speciation of involved species, by considering both phylogenetic relationship of organisms and effects of polyploidy events.
Its performance, evaluated by a list of benchmark gene families, when applying to the clustering of HOGs from 12 Metazoan genomes, reaches up to 87.8% and 83.2% on recall and precision rates respectively. Further application of PhyloMCL on classification of tens of thousands of paralogs, yielded by multiple polyploidy events during evolution of seed plants, successfully identifies the majority of in‐/out‐paralogs at different taxonomic levels.
Benefiting from the strategy of Markov clustering and guidance of species tree, PhyloMCL can accurately classify millions of homologous genes with affordable time, meeting the challenge of phylogenomic studies upon rapid increasing of sequenced genomes.</abstract><cop>London</cop><pub>John Wiley & Sons, Inc</pub><doi>10.1111/2041-210X.13401</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0003-3376-1116</orcidid><orcidid>https://orcid.org/0000-0001-8716-7716</orcidid><orcidid>https://orcid.org/0000-0001-7135-0524</orcidid><orcidid>https://orcid.org/0000-0002-9472-4936</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2041-210X |
ispartof | Methods in ecology and evolution, 2020-08, Vol.11 (8), p.943-954 |
issn | 2041-210X 2041-210X |
language | eng |
recordid | cdi_proquest_journals_2429612933 |
source | Wiley Online Library Journals Frontfile Complete; Alma/SFX Local Collection |
subjects | Clustering; Evolution; Evolutionary genetics; gene duplication; Gene families; Genes; Genomes; hierarchical orthogroup classification; Homology; Markov clustering; paralogs; Phylogenetics; phylogenomics; Phylogeny; Polyploidy; Reproduction (copying); Speciation; Swine |
title | PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T22%3A17%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PhyloMCL:%20Accurate%20clustering%20of%20hierarchical%20orthogroups%20guided%20by%20phylogenetic%20relationship%20and%20inference%20of%20polyploidy%20events&rft.jtitle=Methods%20in%20ecology%20and%20evolution&rft.au=Zhou,%20Shengyu&rft.date=2020-08&rft.volume=11&rft.issue=8&rft.spage=943&rft.epage=954&rft.pages=943-954&rft.issn=2041-210X&rft.eissn=2041-210X&rft_id=info:doi/10.1111/2041-210X.13401&rft_dat=%3Cproquest_cross%3E2429612933%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2429612933&rft_id=info:pmid/&rfr_iscdi=true |