NetSHy: network summarization via a hybrid approach leveraging topological properties

Abstract. Motivation: Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separat...

Detailed description

Bibliographic details
Published in: Bioinformatics (Oxford, England), 2023-01, Vol. 39 (1)
Main authors: Vu, Thao; Litkowski, Elizabeth M; Liu, Weixuan; Pratte, Katherine A; Lange, Leslie; Bowler, Russell P; Banaei-Kashani, Farnoush; Kechris, Katerina J
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title Bioinformatics (Oxford, England)
container_volume 39
creator Vu, Thao
Litkowski, Elizabeth M
Liu, Weixuan
Pratte, Katherine A
Lange, Leslie
Bowler, Russell P
Banaei-Kashani, Farnoush
Kechris, Katerina J
description Abstract. Motivation: Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. To investigate the association between an identified module and a variable of interest in subsequent analyses, a module summarization that best explains the module's information and reduces dimensionality is often needed. Conventional approaches for obtaining a network representation typically rely only on the profiles of the nodes within the network and disregard the inherent topological information of the network. Results: In this article, we propose NetSHy, a hybrid approach that reduces the dimension of a network while incorporating topological properties to aid the interpretation of downstream analyses. In particular, NetSHy applies principal component analysis (PCA) to a combination of the node profiles and the well-known Laplacian matrix, derived directly from the network similarity matrix, to extract a summarization at the subject level. Simulation scenarios based on random and empirical networks of varying sizes and sparsity levels show that NetSHy outperforms the conventional approach of applying PCA directly to node profiles, both in recovering the true correlation with a phenotype of interest and in retaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
Availability and implementation: An R implementation of NetSHy is available at https://github.com/thaovu1/NetSHy. Supplementary information: Supplementary data are available at Bioinformatics online.
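The summarization step described in the abstract can be sketched in a few lines. This is only an illustrative sketch in Python with NumPy, not the authors' R implementation: it assumes that the "combination" of node profiles and Laplacian is the matrix product of the subject-by-node profile matrix with the unnormalized Laplacian L = D - A, and that the summarization is the first principal component score. The function name `netshy_like_summary` and the toy data are hypothetical; the repository linked above holds the actual method.

```python
import numpy as np

def netshy_like_summary(X, A):
    """Sketch of a topology-aware module summarization (assumptions noted below).

    X: (n_subjects, n_nodes) node profiles.
    A: (n_nodes, n_nodes) symmetric network similarity (adjacency) matrix.
    Returns first-PC scores per subject and the fraction of variance they explain.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    # Unnormalized graph Laplacian: degree matrix minus adjacency.
    L = np.diag(A.sum(axis=1)) - A
    # Assumption: profiles and topology are combined as the product X @ L.
    Z = X @ L
    # PCA via SVD of the column-centered matrix.
    Zc = Z - Z.mean(axis=0)
    U, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = U[:, 0] * S[0]                 # subject-level summarization
    explained = S[0] ** 2 / np.sum(S ** 2)  # variance share of the first PC
    return scores, explained

# Toy data (hypothetical): 20 subjects, 5 nodes, similarity from |correlation|.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
A = np.abs(np.corrcoef(X, rowvar=False))
np.fill_diagonal(A, 0.0)  # no self-loops
scores, explained = netshy_like_summary(X, A)
```

The resulting `scores` vector has one value per subject and could then be correlated with a phenotype, which is the kind of downstream analysis the abstract mentions. The conventional baseline the abstract compares against corresponds to running the same PCA on `X` alone.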
doi_str_mv 10.1093/bioinformatics/btac818
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9831052</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btac818</oup_id><sourcerecordid>2757056484</sourcerecordid><originalsourceid>FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</originalsourceid><addsrcrecordid>eNqNkU1P3DAQhq0KVOi2fwFZ6qWX7fojieMekKoVhUoIDpSz5diTXdMkDrazaPvrMdoFQU-cbGmeeTQzL0InlHynRPJF47wbWh96nZyJiyZpU9P6AzqmvBLzoqb04NX_CH2K8Y4QUpKy-oiOeFUWNS_oMbq9gnRzsf2BB0gPPvzFcep7Hdy_7PUD3jiNNV5vm-As1uMYvDZr3MEGgl65YYWTH33nV87oDufqCCE5iJ_RYau7CF_27wzd_jr7s7yYX16f_17-vJybgtM0ZxZaWQHRsqIg6ppIW0jOoQIphG6lZFIwQWzJKW8ICM5a1jBrWGutJaLhM3S6845T04M1MKSgOzUGl3fYKq-delsZ3Fqt_EbJmlNSsiz4thcEfz9BTKp30UDX6QH8FBUTpcgnK-oio1__Q-_8FIa8nsrj8SxjBc9UtaNM8DEGaF-GoUQ9JafeJqf2yeXGk9ervLQ9R5UBugP8NL5X-ghJN65L</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3133523243</pqid></control><display><type>article</type><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J</creator><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J ; Martelli, Pier Luigi</creatorcontrib><description>Abstract Motivation Biological networks can provide a 
system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. Results In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation. 
Availability and implementation R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4811</identifier><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btac818</identifier><identifier>PMID: 36548341</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Availability ; Bioinformatics ; Biological activity ; Biological properties ; Computer Simulation ; Genome-wide association studies ; Genome-Wide Association Study ; Genomic analysis ; Modularity ; Modules ; Networks ; Nodes ; Nucleotides ; Original Paper ; Phenotypes ; Phenotypic variations ; Polymorphism, Single Nucleotide ; Principal Component Analysis ; Principal components analysis ; Representations ; Sample Size ; Single-nucleotide polymorphism ; Topology</subject><ispartof>Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. 2022</rights><rights>The Author(s) 2022. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</cites><orcidid>0000-0002-3725-5459 ; 0000-0001-5252-0006</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9831052/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9831052/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36548341$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Vu, Thao</creatorcontrib><creatorcontrib>Litkowski, Elizabeth M</creatorcontrib><creatorcontrib>Liu, Weixuan</creatorcontrib><creatorcontrib>Pratte, Katherine A</creatorcontrib><creatorcontrib>Lange, Leslie</creatorcontrib><creatorcontrib>Bowler, Russell P</creatorcontrib><creatorcontrib>Banaei-Kashani, Farnoush</creatorcontrib><creatorcontrib>Kechris, Katerina J</creatorcontrib><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. 
In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. Results In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation. 
Availability and implementation R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy Supplementary information Supplementary data are available at Bioinformatics online.</description><subject>Availability</subject><subject>Bioinformatics</subject><subject>Biological activity</subject><subject>Biological properties</subject><subject>Computer Simulation</subject><subject>Genome-wide association studies</subject><subject>Genome-Wide Association Study</subject><subject>Genomic analysis</subject><subject>Modularity</subject><subject>Modules</subject><subject>Networks</subject><subject>Nodes</subject><subject>Nucleotides</subject><subject>Original Paper</subject><subject>Phenotypes</subject><subject>Phenotypic variations</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Principal Component Analysis</subject><subject>Principal components analysis</subject><subject>Representations</subject><subject>Sample Size</subject><subject>Single-nucleotide polymorphism</subject><subject>Topology</subject><issn>1367-4811</issn><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNqNkU1P3DAQhq0KVOi2fwFZ6qWX7fojieMekKoVhUoIDpSz5diTXdMkDrazaPvrMdoFQU-cbGmeeTQzL0InlHynRPJF47wbWh96nZyJiyZpU9P6AzqmvBLzoqb04NX_CH2K8Y4QUpKy-oiOeFUWNS_oMbq9gnRzsf2BB0gPPvzFcep7Hdy_7PUD3jiNNV5vm-As1uMYvDZr3MEGgl65YYWTH33nV87oDufqCCE5iJ_RYau7CF_27wzd_jr7s7yYX16f_17-vJybgtM0ZxZaWQHRsqIg6ppIW0jOoQIphG6lZFIwQWzJKW8ICM5a1jBrWGutJaLhM3S6845T04M1MKSgOzUGl3fYKq-delsZ3Fqt_EbJmlNSsiz4thcEfz9BTKp30UDX6QH8FBUTpcgnK-oio1__Q-_8FIa8nsrj8SxjBc9UtaNM8DEGaF-GoUQ9JafeJqf2yeXGk9ervLQ9R5UBugP8NL5X-ghJN65L</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Vu, Thao</creator><creator>Litkowski, Elizabeth M</creator><creator>Liu, Weixuan</creator><creator>Pratte, Katherine A</creator><creator>Lange, 
Leslie</creator><creator>Bowler, Russell P</creator><creator>Banaei-Kashani, Farnoush</creator><creator>Kechris, Katerina J</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-3725-5459</orcidid><orcidid>https://orcid.org/0000-0001-5252-0006</orcidid></search><sort><creationdate>20230101</creationdate><title>NetSHy: network summarization via a hybrid approach leveraging topological properties</title><author>Vu, Thao ; Litkowski, Elizabeth M ; Liu, Weixuan ; Pratte, Katherine A ; Lange, Leslie ; Bowler, Russell P ; Banaei-Kashani, Farnoush ; Kechris, Katerina J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c431t-2def96e0a961e78809d4933e6e977af99297270d5313b0e732f2b2dc2fddd07b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Availability</topic><topic>Bioinformatics</topic><topic>Biological activity</topic><topic>Biological properties</topic><topic>Computer Simulation</topic><topic>Genome-wide association studies</topic><topic>Genome-Wide Association Study</topic><topic>Genomic 
analysis</topic><topic>Modularity</topic><topic>Modules</topic><topic>Networks</topic><topic>Nodes</topic><topic>Nucleotides</topic><topic>Original Paper</topic><topic>Phenotypes</topic><topic>Phenotypic variations</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Principal Component Analysis</topic><topic>Principal components analysis</topic><topic>Representations</topic><topic>Sample Size</topic><topic>Single-nucleotide polymorphism</topic><topic>Topology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Vu, Thao</creatorcontrib><creatorcontrib>Litkowski, Elizabeth M</creatorcontrib><creatorcontrib>Liu, Weixuan</creatorcontrib><creatorcontrib>Pratte, Katherine A</creatorcontrib><creatorcontrib>Lange, Leslie</creatorcontrib><creatorcontrib>Bowler, Russell P</creatorcontrib><creatorcontrib>Banaei-Kashani, Farnoush</creatorcontrib><creatorcontrib>Kechris, Katerina J</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research 
Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Vu, Thao</au><au>Litkowski, Elizabeth M</au><au>Liu, Weixuan</au><au>Pratte, Katherine A</au><au>Lange, Leslie</au><au>Bowler, Russell P</au><au>Banaei-Kashani, Farnoush</au><au>Kechris, Katerina J</au><au>Martelli, Pier Luigi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>NetSHy: network summarization via a hybrid approach leveraging topological properties</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>39</volume><issue>1</issue><issn>1367-4811</issn><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>Abstract Motivation Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. 
they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module’s information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. Results In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation. 
Availability and implementation R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy Supplementary information Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>36548341</pmid><doi>10.1093/bioinformatics/btac818</doi><orcidid>https://orcid.org/0000-0002-3725-5459</orcidid><orcidid>https://orcid.org/0000-0001-5252-0006</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4811
ispartof Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)
issn 1367-4811
1367-4803
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9831052
source Oxford Journals Open Access Collection; MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Alma/SFX Local Collection
subjects Availability
Bioinformatics
Biological activity
Biological properties
Computer Simulation
Genome-wide association studies
Genome-Wide Association Study
Genomic analysis
Modularity
Modules
Networks
Nodes
Nucleotides
Original Paper
Phenotypes
Phenotypic variations
Polymorphism, Single Nucleotide
Principal Component Analysis
Principal components analysis
Representations
Sample Size
Single-nucleotide polymorphism
Topology
title NetSHy: network summarization via a hybrid approach leveraging topological properties
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T00%3A08%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NetSHy:%20network%20summarization%20via%20a%20hybrid%20approach%20leveraging%20topological%20properties&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Vu,%20Thao&rft.date=2023-01-01&rft.volume=39&rft.issue=1&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btac818&rft_dat=%3Cproquest_pubme%3E2757056484%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3133523243&rft_id=info:pmid/36548341&rft_oup_id=10.1093/bioinformatics/btac818&rfr_iscdi=true