Benchmarking network-based gene prioritization methods for cerebral small vessel disease

Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based o...

Bibliographic details
Published in: Briefings in bioinformatics 2021-09, Vol.22 (5)
Main authors: Zhang, Huayu, Ferguson, Amy, Robertson, Grant, Jiang, Muchen, Zhang, Teng, Sudlow, Cathie, Smith, Keith, Rannikmae, Kristiina, Wu, Honghan
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 5
container_start_page
container_title Briefings in bioinformatics
container_volume 22
creator Zhang, Huayu
Ferguson, Amy
Robertson, Grant
Jiang, Muchen
Zhang, Teng
Sudlow, Cathie
Smith, Keith
Rannikmae, Kristiina
Wu, Honghan
description Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. The performance of the algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with the MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed the best LOOCV performance, with a median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had the most GWAS-confirmable genes in its top 200 predictions, while RWRH had the best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. The choice of algorithm should be determined before it is applied to a specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.
doi_str_mv 10.1093/bib/bbab006
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8425308</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2494297227</sourcerecordid><originalsourceid>FETCH-LOGICAL-c381t-da04432eb4267e4d057a26c33f03eda42f3cd1bc77ee3489b6af3492178238f53</originalsourceid><addsrcrecordid>eNpVkc1LxDAQxYMofp-8S46CVJNM2rQXQcUvELwoeAtJOt2Nto0m3RX96-2yq-gpA_nNmzfzCDng7ISzCk6tt6fWGstYsUa2uVQqkyyX64u6UFkuC9giOym9MCaYKvkm2QIoQAIX2-T5Ans37Ux89f2E9jh8hPiaWZOwphPskb5FH6If_JcZfOhph8M01Ik2IVKHEW00LU2daVs6x5SwpbVPOLbvkY3GtAn3V-8uebq-ery8ze4fbu4uz-8zByUfstowKUGglaJQKGuWKyMKB9AwwNpI0YCruXVKIYIsK1uYBmQluCoFlE0Ou-Rsqfs2sx3WDvthtKRH2-NSnzoYr___9H6qJ2GuSylyYOUocLQSiOF9hmnQnU8O29b0GGZJC1lJUSkh1IgeL1EXQ0oRm98xnOlFFnrMQq-yGOnDv85-2Z_jwzeMcYh4</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2494297227</pqid></control><display><type>article</type><title>Benchmarking network-based gene prioritization methods for cerebral small vessel disease</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Business Source Complete</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Zhang, Huayu ; Ferguson, Amy ; Robertson, Grant ; Jiang, Muchen ; Zhang, Teng ; Sudlow, Cathie ; Smith, Keith ; Rannikmae, Kristiina ; Wu, Honghan</creator><creatorcontrib>Zhang, Huayu ; Ferguson, Amy ; Robertson, Grant ; Jiang, Muchen ; Zhang, Teng ; Sudlow, Cathie ; Smith, Keith ; Rannikmae, Kristiina ; Wu, Honghan</creatorcontrib><description>Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. 
Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed best LOOCV performance, with median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had most GWAS-confirmable genes in top 200 predictions, while RWRH had best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. Choice of algorithms should be determined before applying to specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. 
The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbab006</identifier><identifier>PMID: 33634312</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Algorithms ; Benchmarking - methods ; Cerebral Small Vessel Diseases - genetics ; Computational Biology - methods ; Gene Regulatory Networks ; Genome-Wide Association Study ; Humans ; Multigene Family ; Phenotype ; Problem Solving Protocol ; Protein Interaction Maps - genetics ; Risk Factors</subject><ispartof>Briefings in bioinformatics, 2021-09, Vol.22 (5)</ispartof><rights>The Author(s) 2021. Published by Oxford University Press.</rights><rights>The Author(s) 2021. Published by Oxford University Press. 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c381t-da04432eb4267e4d057a26c33f03eda42f3cd1bc77ee3489b6af3492178238f53</citedby><cites>FETCH-LOGICAL-c381t-da04432eb4267e4d057a26c33f03eda42f3cd1bc77ee3489b6af3492178238f53</cites><orcidid>0000-0003-1625-4067 ; 0000-0002-5310-8766</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425308/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425308/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33634312$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Huayu</creatorcontrib><creatorcontrib>Ferguson, 
Amy</creatorcontrib><creatorcontrib>Robertson, Grant</creatorcontrib><creatorcontrib>Jiang, Muchen</creatorcontrib><creatorcontrib>Zhang, Teng</creatorcontrib><creatorcontrib>Sudlow, Cathie</creatorcontrib><creatorcontrib>Smith, Keith</creatorcontrib><creatorcontrib>Rannikmae, Kristiina</creatorcontrib><creatorcontrib>Wu, Honghan</creatorcontrib><title>Benchmarking network-based gene prioritization methods for cerebral small vessel disease</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed best LOOCV performance, with median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had most GWAS-confirmable genes in top 200 predictions, while RWRH had best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. 
Choice of algorithms should be determined before applying to specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.</description><subject>Algorithms</subject><subject>Benchmarking - methods</subject><subject>Cerebral Small Vessel Diseases - genetics</subject><subject>Computational Biology - methods</subject><subject>Gene Regulatory Networks</subject><subject>Genome-Wide Association Study</subject><subject>Humans</subject><subject>Multigene Family</subject><subject>Phenotype</subject><subject>Problem Solving Protocol</subject><subject>Protein Interaction Maps - genetics</subject><subject>Risk Factors</subject><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkc1LxDAQxYMofp-8S46CVJNM2rQXQcUvELwoeAtJOt2Nto0m3RX96-2yq-gpA_nNmzfzCDng7ISzCk6tt6fWGstYsUa2uVQqkyyX64u6UFkuC9giOym9MCaYKvkm2QIoQAIX2-T5Ans37Ux89f2E9jh8hPiaWZOwphPskb5FH6If_JcZfOhph8M01Ik2IVKHEW00LU2daVs6x5SwpbVPOLbvkY3GtAn3V-8uebq-ery8ze4fbu4uz-8zByUfstowKUGglaJQKGuWKyMKB9AwwNpI0YCruXVKIYIsK1uYBmQluCoFlE0Ou-Rsqfs2sx3WDvthtKRH2-NSnzoYr___9H6qJ2GuSylyYOUocLQSiOF9hmnQnU8O29b0GGZJC1lJUSkh1IgeL1EXQ0oRm98xnOlFFnrMQq-yGOnDv85-2Z_jwzeMcYh4</recordid><startdate>20210902</startdate><enddate>20210902</enddate><creator>Zhang, Huayu</creator><creator>Ferguson, Amy</creator><creator>Robertson, Grant</creator><creator>Jiang, Muchen</creator><creator>Zhang, Teng</creator><creator>Sudlow, Cathie</creator><creator>Smith, Keith</creator><creator>Rannikmae, Kristiina</creator><creator>Wu, Honghan</creator><general>Oxford University 
Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-1625-4067</orcidid><orcidid>https://orcid.org/0000-0002-5310-8766</orcidid></search><sort><creationdate>20210902</creationdate><title>Benchmarking network-based gene prioritization methods for cerebral small vessel disease</title><author>Zhang, Huayu ; Ferguson, Amy ; Robertson, Grant ; Jiang, Muchen ; Zhang, Teng ; Sudlow, Cathie ; Smith, Keith ; Rannikmae, Kristiina ; Wu, Honghan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c381t-da04432eb4267e4d057a26c33f03eda42f3cd1bc77ee3489b6af3492178238f53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Benchmarking - methods</topic><topic>Cerebral Small Vessel Diseases - genetics</topic><topic>Computational Biology - methods</topic><topic>Gene Regulatory Networks</topic><topic>Genome-Wide Association Study</topic><topic>Humans</topic><topic>Multigene Family</topic><topic>Phenotype</topic><topic>Problem Solving Protocol</topic><topic>Protein Interaction Maps - genetics</topic><topic>Risk Factors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Huayu</creatorcontrib><creatorcontrib>Ferguson, Amy</creatorcontrib><creatorcontrib>Robertson, Grant</creatorcontrib><creatorcontrib>Jiang, Muchen</creatorcontrib><creatorcontrib>Zhang, Teng</creatorcontrib><creatorcontrib>Sudlow, Cathie</creatorcontrib><creatorcontrib>Smith, Keith</creatorcontrib><creatorcontrib>Rannikmae, Kristiina</creatorcontrib><creatorcontrib>Wu, Honghan</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Huayu</au><au>Ferguson, Amy</au><au>Robertson, Grant</au><au>Jiang, Muchen</au><au>Zhang, Teng</au><au>Sudlow, Cathie</au><au>Smith, Keith</au><au>Rannikmae, Kristiina</au><au>Wu, Honghan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Benchmarking network-based gene prioritization methods for cerebral small vessel disease</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2021-09-02</date><risdate>2021</risdate><volume>22</volume><issue>5</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with MEGASTROKE genome-wide association study (GWAS). 
We found that random walk with restart on the heterogeneous network (RWRH) showed best LOOCV performance, with median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had most GWAS-confirmable genes in top 200 predictions, while RWRH had best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. Choice of algorithms should be determined before applying to specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>33634312</pmid><doi>10.1093/bib/bbab006</doi><orcidid>https://orcid.org/0000-0003-1625-4067</orcidid><orcidid>https://orcid.org/0000-0002-5310-8766</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2021-09, Vol.22 (5)
issn 1467-5463
1477-4054
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8425308
source MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Business Source Complete; Oxford Journals Open Access Collection; PubMed Central
subjects Algorithms
Benchmarking - methods
Cerebral Small Vessel Diseases - genetics
Computational Biology - methods
Gene Regulatory Networks
Genome-Wide Association Study
Humans
Multigene Family
Phenotype
Problem Solving Protocol
Protein Interaction Maps - genetics
Risk Factors
title Benchmarking network-based gene prioritization methods for cerebral small vessel disease
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T06%3A44%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Benchmarking%20network-based%20gene%20prioritization%20methods%20for%20cerebral%20small%20vessel%20disease&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Zhang,%20Huayu&rft.date=2021-09-02&rft.volume=22&rft.issue=5&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbab006&rft_dat=%3Cproquest_pubme%3E2494297227%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2494297227&rft_id=info:pmid/33634312&rfr_iscdi=true
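
The description above reports a median LOOCV rediscovery rank for a random walk with restart. As a rough illustration of what that metric measures, the following sketch runs a plain random walk with restart on a tiny, made-up homogeneous network and computes the rank at which each held-out known gene is rediscovered. This is not the authors' tooling and not RWRH itself, which walks a heterogeneous disease-gene network. The function names `rwr` and `loocv_ranks`, the toy adjacency matrix, the seed genes and the restart probability of 0.7 are all assumptions chosen for illustration.

```python
import numpy as np


def rwr(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart; returns steady-state visiting probabilities."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # guard against isolated nodes
    w = adj / col_sums  # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)  # walker restarts uniformly at the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def loocv_ranks(adj, known):
    """Hold out each known gene in turn and record the rank at which it is rediscovered."""
    ranks = []
    for held_out in known:
        seeds = [g for g in known if g != held_out]
        scores = rwr(adj, seeds)
        # Rank only genes that are not currently used as seeds.
        candidates = [g for g in range(len(scores)) if g not in seeds]
        ordered = sorted(candidates, key=lambda g: -scores[g])
        ranks.append(ordered.index(held_out) + 1)  # 1 = best possible rank
    return ranks


# Hypothetical toy network: 6 genes, undirected edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
adj = np.zeros((6, 6))
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0

known = [0, 1, 2]  # hypothetical "known disease genes"
ranks = loocv_ranks(adj, known)
median_rank = float(np.median(ranks))
print("rediscovery ranks:", ranks, "median:", median_rank)
```

A lower median rank means held-out known genes tend to reappear near the top of the prediction list. On a real network the candidate pool would be all non-seed genes (19 463 genes in total in the study described above) rather than the handful used here.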