Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets

Abstract. Motivation: Complex data structures composed of different groups of observations and blocks of variables are increasingly collected in many domains, including metabolomics. Analysing these high-dimensional data constitutes a challenge, and the objective of this article is to present an origi...

Detailed description

Bibliographic details
Published in:	Bioinformatics 2021-06, Vol.37 (9), p.1297-1303
Main authors:	Codesido, Santiago ; Hanafi, Mohamed ; Gagnebin, Yoric ; González-Ruiz, Víctor ; Rudaz, Serge ; Boccard, Julien
Format:	Article
Language:	eng
Online access:	Order full text

container_end_page	1303
container_issue	9
container_start_page	1297
container_title	Bioinformatics
container_volume	37
creator	Codesido, Santiago ; Hanafi, Mohamed ; Gagnebin, Yoric ; González-Ruiz, Víctor ; Rudaz, Serge ; Boccard, Julien
description	Abstract. Motivation: Complex data structures composed of different groups of observations and blocks of variables are increasingly collected in many domains, including metabolomics. Analysing these high-dimensional data constitutes a challenge, and the objective of this article is to present an original multivariate method capable of explicitly taking into account links between data tables when they involve the same observations and/or variables. For that purpose, an extension of standard principal component analysis called NetPCA was developed. Results: The proposed algorithm was illustrated as an efficient solution for addressing complex multigroup and multiblock datasets. A case study involving the analysis of metabolomic data with different annotation levels and originating from a chronic kidney disease (CKD) study was used to highlight the different aspects and the additional outputs of the method compared to standard PCA. On the one hand, the model parameters allowed an efficient evaluation of each group’s influence to be performed. On the other hand, the relative relevance of each block of variables to the model provided decisive information for an objective interpretation of the different metabolic annotation levels. Availability and implementation: NetPCA is available as a Python package with NumPy dependencies. Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv	10.1093/bioinformatics/btaa954
format	Article
fullrecord	<record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2458964787</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btaa954</oup_id><sourcerecordid>2458964787</sourcerecordid><originalsourceid>FETCH-LOGICAL-c330t-b12ddf57a07867950d02d05bb50d54a275f9178c802d73a588608a602bac42a83</originalsourceid><addsrcrecordid>eNqNUMtOwzAQjBBIlMIvIB-5hNqJHTvcUMVLquAC52jjOGDq2MF2ivr3GKUXbpx2Z3dmH5NllwRfE1yXq1Y7bXvnB4hahlUbAWpGj7IFoRXOC8zq45SXFc-pwOVpdhbCJ8aMUEoXmX9W8dv5LRq9tlKPYJB0w-isshGBBbMPOtwgQDvlQ1pgFIrOGZT2ofihkLY7FaJ-Ty1nkevRMJkEvZvGJO9m2Bont6iDCEHFcJ6d9GCCujjEZfZ2f_e6fsw3Lw9P69tNLssSx7wlRdf1jAPmouI1wx0uOszaNmWMQsFZXxMupEhlXgITosICKly0IGkBolxmV_Pc0buvKV3ZDDpIZQxY5abQFJSJuqJc8EStZqr0LgSv-ibZMYDfNwQ3vyY3f01uDiYnIZmF6d__an4A6pqKVw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2458964787</pqid></control><display><type>article</type><title>Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets</title><source>Oxford Journals Open Access Collection</source><creator>Codesido, Santiago ; Hanafi, Mohamed ; Gagnebin, Yoric ; González-Ruiz, Víctor ; Rudaz, Serge ; Boccard, Julien</creator><creatorcontrib>Codesido, Santiago ; Hanafi, Mohamed ; Gagnebin, Yoric ; González-Ruiz, Víctor ; Rudaz, Serge ; Boccard, Julien</creatorcontrib><description>Abstract Motivation Complex data structures composed of different groups of observations and blocks of variables are increasingly collected in many domains, including metabolomics. Analysing these high-dimensional data constitutes a challenge, and the objective of this article is to present an original multivariate method capable of explicitly taking into account links between data tables when they involve the same observations and/or variables. 
For that purpose, an extension of standard principal component analysis called NetPCA was developed. Results The proposed algorithm was illustrated as an efficient solution for addressing complex multigroup and multiblock datasets. A case study involving the analysis of metabolomic data with different annotation levels and originating from a chronic kidney disease (CKD) study was used to highlight the different aspects and the additional outputs of the method compared to standard PCA. On the one hand, the model parameters allowed an efficient evaluation of each group’s influence to be performed. On the other hand, the relative relevance of each block of variables to the model provided decisive information for an objective interpretation of the different metabolic annotation levels. Availability and implementation NetPCA is available as a Python package with NumPy dependencies. Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btaa954</identifier><language>eng</language><publisher>Oxford University Press</publisher><ispartof>Bioinformatics, 2021-06, Vol.37 (9), p.1297-1303</ispartof><rights>The Author(s) 2020. Published by Oxford University Press. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c330t-b12ddf57a07867950d02d05bb50d54a275f9178c802d73a588608a602bac42a83</citedby><cites>FETCH-LOGICAL-c330t-b12ddf57a07867950d02d05bb50d54a275f9178c802d73a588608a602bac42a83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1598,27901,27902</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/btaa954$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc></links><search><creatorcontrib>Codesido, Santiago</creatorcontrib><creatorcontrib>Hanafi, Mohamed</creatorcontrib><creatorcontrib>Gagnebin, Yoric</creatorcontrib><creatorcontrib>González-Ruiz, Víctor</creatorcontrib><creatorcontrib>Rudaz, Serge</creatorcontrib><creatorcontrib>Boccard, Julien</creatorcontrib><title>Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets</title><title>Bioinformatics</title><description>Abstract Motivation Complex data structures composed of different groups of observations and blocks of variables are increasingly collected in many domains, including metabolomics. Analysing these high-dimensional data constitutes a challenge, and the objective of this article is to present an original multivariate method capable of explicitly taking into account links between data tables when they involve the same observations and/or variables. For that purpose, an extension of standard principal component analysis called NetPCA was developed. Results The proposed algorithm was illustrated as an efficient solution for addressing complex multigroup and multiblock datasets. 
A case study involving the analysis of metabolomic data with different annotation levels and originating from a chronic kidney disease (CKD) study was used to highlight the different aspects and the additional outputs of the method compared to standard PCA. On the one hand, the model parameters allowed an efficient evaluation of each group’s influence to be performed. On the other hand, the relative relevance of each block of variables to the model provided decisive information for an objective interpretation of the different metabolic annotation levels. Availability and implementation NetPCA is available as a Python package with NumPy dependencies. Supplementary information Supplementary data are available at Bioinformatics online.</description><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNqNUMtOwzAQjBBIlMIvIB-5hNqJHTvcUMVLquAC52jjOGDq2MF2ivr3GKUXbpx2Z3dmH5NllwRfE1yXq1Y7bXvnB4hahlUbAWpGj7IFoRXOC8zq45SXFc-pwOVpdhbCJ8aMUEoXmX9W8dv5LRq9tlKPYJB0w-isshGBBbMPOtwgQDvlQ1pgFIrOGZT2ofihkLY7FaJ-Ty1nkevRMJkEvZvGJO9m2Bont6iDCEHFcJ6d9GCCujjEZfZ2f_e6fsw3Lw9P69tNLssSx7wlRdf1jAPmouI1wx0uOszaNmWMQsFZXxMupEhlXgITosICKly0IGkBolxmV_Pc0buvKV3ZDDpIZQxY5abQFJSJuqJc8EStZqr0LgSv-ibZMYDfNwQ3vyY3f01uDiYnIZmF6d__an4A6pqKVw</recordid><startdate>20210609</startdate><enddate>20210609</enddate><creator>Codesido, Santiago</creator><creator>Hanafi, Mohamed</creator><creator>Gagnebin, Yoric</creator><creator>González-Ruiz, Víctor</creator><creator>Rudaz, Serge</creator><creator>Boccard, Julien</creator><general>Oxford University Press</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20210609</creationdate><title>Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets</title><author>Codesido, Santiago ; Hanafi, Mohamed ; Gagnebin, Yoric ; 
González-Ruiz, Víctor ; Rudaz, Serge ; Boccard, Julien</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c330t-b12ddf57a07867950d02d05bb50d54a275f9178c802d73a588608a602bac42a83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Codesido, Santiago</creatorcontrib><creatorcontrib>Hanafi, Mohamed</creatorcontrib><creatorcontrib>Gagnebin, Yoric</creatorcontrib><creatorcontrib>González-Ruiz, Víctor</creatorcontrib><creatorcontrib>Rudaz, Serge</creatorcontrib><creatorcontrib>Boccard, Julien</creatorcontrib><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Codesido, Santiago</au><au>Hanafi, Mohamed</au><au>Gagnebin, Yoric</au><au>González-Ruiz, Víctor</au><au>Rudaz, Serge</au><au>Boccard, Julien</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets</atitle><jtitle>Bioinformatics</jtitle><date>2021-06-09</date><risdate>2021</risdate><volume>37</volume><issue>9</issue><spage>1297</spage><epage>1303</epage><pages>1297-1303</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><abstract>Abstract Motivation Complex data structures composed of different groups of observations and blocks of variables are increasingly collected in many domains, including metabolomics. Analysing these high-dimensional data constitutes a challenge, and the objective of this article is to present an original multivariate method capable of explicitly taking into account links between data tables when they involve the same observations and/or variables. 
For that purpose, an extension of standard principal component analysis called NetPCA was developed. Results The proposed algorithm was illustrated as an efficient solution for addressing complex multigroup and multiblock datasets. A case study involving the analysis of metabolomic data with different annotation levels and originating from a chronic kidney disease (CKD) study was used to highlight the different aspects and the additional outputs of the method compared to standard PCA. On the one hand, the model parameters allowed an efficient evaluation of each group’s influence to be performed. On the other hand, the relative relevance of each block of variables to the model provided decisive information for an objective interpretation of the different metabolic annotation levels. Availability and implementation NetPCA is available as a Python package with NumPy dependencies. Supplementary information Supplementary data are available at Bioinformatics online.</abstract><pub>Oxford University Press</pub><doi>10.1093/bioinformatics/btaa954</doi><tpages>7</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1367-4803
ispartof	Bioinformatics, 2021-06, Vol.37 (9), p.1297-1303
issn	1367-4803 1460-2059 1367-4811
language	eng
recordid	cdi_proquest_miscellaneous_2458964787
source	Oxford Journals Open Access Collection
title	Network principal component analysis: a versatile tool for the investigation of multigroup and multiblock datasets
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T02%3A28%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Network%20principal%20component%20analysis:%20a%20versatile%20tool%20for%20the%20investigation%20of%20multigroup%20and%20multiblock%20datasets&rft.jtitle=Bioinformatics&rft.au=Codesido,%20Santiago&rft.date=2021-06-09&rft.volume=37&rft.issue=9&rft.spage=1297&rft.epage=1303&rft.pages=1297-1303&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btaa954&rft_dat=%3Cproquest_TOX%3E2458964787%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2458964787&rft_id=info:pmid/&rft_oup_id=10.1093/bioinformatics/btaa954&rfr_iscdi=true
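
The abstract describes NetPCA as an extension of standard principal component analysis and compares its outputs to standard PCA. As a point of reference only, the sketch below shows that standard PCA baseline applied to two blocks of variables measured on the same observations, using NumPy. It is not the NetPCA algorithm and does not use the NetPCA package. The data are random, and the names `pca_svd`, `scale_block`, `block_a`, and `block_b` are hypothetical. Scaling each block to unit Frobenius norm is a common multiblock preprocessing choice assumed here, not a step taken from the article.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """Standard PCA via SVD of the column-centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # observation coordinates
    loadings = Vt[:n_components].T                   # variable weights
    explained = (s ** 2) / np.sum(s ** 2)            # variance ratio per component
    return scores, loadings, explained[:n_components]


def scale_block(B):
    """Centre a block and scale it to unit Frobenius norm (assumed preprocessing)."""
    Bc = B - B.mean(axis=0)
    return Bc / np.linalg.norm(Bc)


# Hypothetical data: two blocks of variables on the same 20 observations.
rng = np.random.default_rng(0)
block_a = rng.normal(size=(20, 5))
block_b = rng.normal(size=(20, 8))

# Concatenate the scaled blocks column-wise and run ordinary PCA.
X = np.hstack([scale_block(block_a), scale_block(block_b)])
scores, loadings, explained = pca_svd(X, n_components=2)

# Share of each block in the first component: squared loadings sum to 1,
# so the two shares below add up to 1.
contrib_a = np.sum(loadings[:5, 0] ** 2)
contrib_b = np.sum(loadings[5:, 0] ** 2)
print(scores.shape, loadings.shape, contrib_a + contrib_b)
```

In this baseline the per-block shares of a component are only a rough analogue of the "relative relevance of each block of variables" mentioned in the abstract. How NetPCA itself computes block and group contributions is described in the article and is not reproduced here.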