NetSHy: network summarization via a hybrid approach leveraging topological properties
Abstract Motivation Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separat...
Published in: | Bioinformatics (Oxford, England), 2023-01, Vol.39 (1) |
---|---|
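Main authors: | Vu, Thao; Litkowski, Elizabeth M; Liu, Weixuan; Pratte, Katherine A; Lange, Leslie; Bowler, Russell P; Banaei-Kashani, Farnoush; Kechris, Katerina J |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |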
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | |
container_title | Bioinformatics (Oxford, England) |
container_volume | 39 |
creator | Vu, Thao; Litkowski, Elizabeth M; Liu, Weixuan; Pratte, Katherine A; Lange, Leslie; Bowler, Russell P; Banaei-Kashani, Farnoush; Kechris, Katerina J |
description | Abstract
Motivation
Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. To investigate the association between an identified module and a variable of interest in subsequent analyses, a module summarization that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information.
Results
In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
Availability and implementation
R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy
Supplementary information
Supplementary data are available at Bioinformatics online. |
doi_str_mv | 10.1093/bioinformatics/btac818 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9831052</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btac818</oup_id><sourcerecordid>2757056484</sourcerecordid><originalsourceid>FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</originalsourceid><addsrcrecordid>eNqNkU1P3DAQhq0KVOi2fwFZ6qWX7fojieMekKoVhUoIDpSz5diTXdMkDrazaPvrMdoFQU-cbGmeeTQzL0InlHynRPJF47wbWh96nZyJiyZpU9P6AzqmvBLzoqb04NX_CH2K8Y4QUpKy-oiOeFUWNS_oMbq9gnRzsf2BB0gPPvzFcep7Hdy_7PUD3jiNNV5vm-As1uMYvDZr3MEGgl65YYWTH33nV87oDufqCCE5iJ_RYau7CF_27wzd_jr7s7yYX16f_17-vJybgtM0ZxZaWQHRsqIg6ppIW0jOoQIphG6lZFIwQWzJKW8ICM5a1jBrWGutJaLhM3S6845T04M1MKSgOzUGl3fYKq-delsZ3Fqt_EbJmlNSsiz4thcEfz9BTKp30UDX6QH8FBUTpcgnK-oio1__Q-_8FIa8nsrj8SxjBc9UtaNM8DEGaF-GoUQ9JafeJqf2yeXGk9ervLQ9R5UBugP8NL5X-ghJN65L</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3133523243</pqid></control><display><type>article</type><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J</creator><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J ; Martelli, Pier Luigi</creatorcontrib><description>Abstract
Motivation
Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information.
Results
In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
Availability and implementation
R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy
Supplementary information
Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4811</identifier><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btac818</identifier><identifier>PMID: 36548341</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Availability ; Bioinformatics ; Biological activity ; Biological properties ; Computer Simulation ; Genome-wide association studies ; Genome-Wide Association Study ; Genomic analysis ; Modularity ; Modules ; Networks ; Nodes ; Nucleotides ; Original Paper ; Phenotypes ; Phenotypic variations ; Polymorphism, Single Nucleotide ; Principal Component Analysis ; Principal components analysis ; Representations ; Sample Size ; Single-nucleotide polymorphism ; Topology</subject><ispartof>Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. 2022</rights><rights>The Author(s) 2022. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</cites><orcidid>0000-0002-3725-5459 ; 0000-0001-5252-0006</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9831052/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9831052/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36548341$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Vu, Thao</creatorcontrib><creatorcontrib>Litkowski, Elizabeth M</creatorcontrib><creatorcontrib>Liu, Weixuan</creatorcontrib><creatorcontrib>Pratte, Katherine A</creatorcontrib><creatorcontrib>Lange, Leslie</creatorcontrib><creatorcontrib>Bowler, Russell P</creatorcontrib><creatorcontrib>Banaei-Kashani, Farnoush</creatorcontrib><creatorcontrib>Kechris, Katerina J</creatorcontrib><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>Abstract
Motivation
Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information.
Results
In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
Availability and implementation
R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy
Supplementary information
Supplementary data are available at Bioinformatics online.</description><subject>Availability</subject><subject>Bioinformatics</subject><subject>Biological activity</subject><subject>Biological properties</subject><subject>Computer Simulation</subject><subject>Genome-wide association studies</subject><subject>Genome-Wide Association Study</subject><subject>Genomic analysis</subject><subject>Modularity</subject><subject>Modules</subject><subject>Networks</subject><subject>Nodes</subject><subject>Nucleotides</subject><subject>Original Paper</subject><subject>Phenotypes</subject><subject>Phenotypic variations</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Principal Component Analysis</subject><subject>Principal components analysis</subject><subject>Representations</subject><subject>Sample Size</subject><subject>Single-nucleotide polymorphism</subject><subject>Topology</subject><issn>1367-4811</issn><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNqNkU1P3DAQhq0KVOi2fwFZ6qWX7fojieMekKoVhUoIDpSz5diTXdMkDrazaPvrMdoFQU-cbGmeeTQzL0InlHynRPJF47wbWh96nZyJiyZpU9P6AzqmvBLzoqb04NX_CH2K8Y4QUpKy-oiOeFUWNS_oMbq9gnRzsf2BB0gPPvzFcep7Hdy_7PUD3jiNNV5vm-As1uMYvDZr3MEGgl65YYWTH33nV87oDufqCCE5iJ_RYau7CF_27wzd_jr7s7yYX16f_17-vJybgtM0ZxZaWQHRsqIg6ppIW0jOoQIphG6lZFIwQWzJKW8ICM5a1jBrWGutJaLhM3S6845T04M1MKSgOzUGl3fYKq-delsZ3Fqt_EbJmlNSsiz4thcEfz9BTKp30UDX6QH8FBUTpcgnK-oio1__Q-_8FIa8nsrj8SxjBc9UtaNM8DEGaF-GoUQ9JafeJqf2yeXGk9ervLQ9R5UBugP8NL5X-ghJN65L</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Vu, Thao</creator><creator>Litkowski, Elizabeth M</creator><creator>Liu, Weixuan</creator><creator>Pratte, Katherine A</creator><creator>Lange, Leslie</creator><creator>Bowler, Russell P</creator><creator>Banaei-Kashani, Farnoush</creator><creator>Kechris, Katerina 
J</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-3725-5459</orcidid><orcidid>https://orcid.org/0000-0001-5252-0006</orcidid></search><sort><creationdate>20230101</creationdate><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><author>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Availability</topic><topic>Bioinformatics</topic><topic>Biological activity</topic><topic>Biological properties</topic><topic>Computer Simulation</topic><topic>Genome-wide association studies</topic><topic>Genome-Wide Association Study</topic><topic>Genomic analysis</topic><topic>Modularity</topic><topic>Modules</topic><topic>Networks</topic><topic>Nodes</topic><topic>Nucleotides</topic><topic>Original Paper</topic><topic>Phenotypes</topic><topic>Phenotypic 
variations</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Principal Component Analysis</topic><topic>Principal components analysis</topic><topic>Representations</topic><topic>Sample Size</topic><topic>Single-nucleotide polymorphism</topic><topic>Topology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Vu, Thao</creatorcontrib><creatorcontrib>Litkowski, Elizabeth M</creatorcontrib><creatorcontrib>Liu, Weixuan</creatorcontrib><creatorcontrib>Pratte, Katherine A</creatorcontrib><creatorcontrib>Lange, Leslie</creatorcontrib><creatorcontrib>Bowler, Russell P</creatorcontrib><creatorcontrib>Banaei-Kashani, Farnoush</creatorcontrib><creatorcontrib>Kechris, Katerina J</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper 
Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Vu, Thao</au><au>Litkowski, Elizabeth M</au><au>Liu, Weixuan</au><au>Pratte, Katherine A</au><au>Lange, Leslie</au><au>Bowler, Russell P</au><au>Banaei-Kashani, Farnoush</au><au>Kechris, Katerina J</au><au>Martelli, Pier Luigi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>NetSHy: network summarization via a hybrid approach leveraging topological properties</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>39</volume><issue>1</issue><issn>1367-4811</issn><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>Abstract
Motivation
Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information.
Results
In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
Availability and implementation
R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy
Supplementary information
Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>36548341</pmid><doi>10.1093/bioinformatics/btac818</doi><orcidid>https://orcid.org/0000-0002-3725-5459</orcidid><orcidid>https://orcid.org/0000-0001-5252-0006</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4811 |
ispartof | Bioinformatics (Oxford, England), 2023-01, Vol.39 (1) |
issn | 1367-4811 1367-4803 1367-4811 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9831052 |
source | Oxford Journals Open Access Collection; MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Alma/SFX Local Collection |
subjects | Availability; Bioinformatics; Biological activity; Biological properties; Computer Simulation; Genome-wide association studies; Genome-Wide Association Study; Genomic analysis; Modularity; Modules; Networks; Nodes; Nucleotides; Original Paper; Phenotypes; Phenotypic variations; Polymorphism, Single Nucleotide; Principal Component Analysis; Principal components analysis; Representations; Sample Size; Single-nucleotide polymorphism; Topology |
title | NetSHy: network summarization via a hybrid approach leveraging topological properties |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T00%3A08%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NetSHy:%20network%20summarization%20via%20a%20hybrid%20approach%20leveraging%20topological%20properties&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Vu,%20Thao&rft.date=2023-01-01&rft.volume=39&rft.issue=1&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btac818&rft_dat=%3Cproquest_pubme%3E2757056484%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3133523243&rft_id=info:pmid/36548341&rft_oup_id=10.1093/bioinformatics/btac818&rfr_iscdi=true |
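
The abstract in this record describes NetSHy as PCA applied to a combination of the node profiles and the Laplacian matrix derived from the network similarity matrix. The sketch below illustrates that idea in Python and is not the authors' R implementation. Several details are assumptions, since the abstract does not state them: the combination is taken to be the matrix product of the profile matrix `X` (subjects by nodes) with the unnormalized Laplacian `L = D - A`, and the subject-level summarization is taken to be the leading principal component score. The names `laplacian` and `netshy_scores` are hypothetical.

```python
import numpy as np


def laplacian(A):
    """Unnormalized graph Laplacian L = D - A (assumed form; the abstract
    only says the Laplacian is derived from the similarity matrix)."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A


def netshy_scores(X, A, n_components=1):
    """Hypothetical NetSHy-style summarization.

    X : (n_subjects, n_nodes) node profiles.
    A : (n_nodes, n_nodes) symmetric network similarity matrix.
    Returns per-subject PC scores and the fraction of variance explained.
    """
    X = np.asarray(X, dtype=float)
    # Assumed combination step: weight the profiles by the network topology.
    Z = X @ laplacian(A)
    # Center columns so the SVD below is equivalent to PCA.
    Z = Z - Z.mean(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))           # 20 subjects, 5 nodes (toy data)
    W = np.triu(rng.uniform(size=(5, 5)), 1)
    A = W + W.T                            # symmetric, zero diagonal
    scores, explained = netshy_scores(X, A)
    print(scores.shape, explained)
```

For comparison, the conventional approach mentioned in the abstract corresponds to running the same PCA on `X` directly, without the Laplacian weighting. The reference implementation is the R code at https://github.com/thaovu1/NetSHy.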