limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of methods

Summary: Many modern analytical methods are used to analyze samples from designed experiments, for example in medical, biological, chemical, or agronomic fields. These methods usually generate highly multivariate data such as spectra or images, where the number of variables (descript...

Full description

Bibliographic details
Published in: Journal of chemometrics 2023-07, Vol.37 (7), p.n/a
Main authors: Thiel, Michel; Benaiche, Nadia; Martin, Manon; Franceschini, Sébastien; Van Oirbeek, Robin; Govaerts, Bernadette
Format: Article
Language: eng
Online access: Full text
container_end_page n/a
container_issue 7
container_start_page
container_title Journal of chemometrics
container_volume 37
creator Thiel, Michel
Benaiche, Nadia
Martin, Manon
Franceschini, Sébastien
Van Oirbeek, Robin
Govaerts, Bernadette
description Summary Many modern analytical methods are used to analyze samples from designed experiments, for example in medical, biological, chemical, or agronomic fields. These methods usually generate highly multivariate data such as spectra or images, in which the number of variables (descriptor responses) tends to be much larger than the number of experimental units. Multivariate statistical tools are therefore needed to identify the variables that are consistently affected by the experimental factors. In this context, two recent methods combining ANOVA and PCA were developed: ASCA (ANOVA‐Simultaneous Component Analysis) and APCA (ANOVA‐Principal Component Analysis). They provide powerful tools for visualizing multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they produce biased estimators of the factor effects when the experimental design is unbalanced. This article presents the R package limpca (for linear models with principal component effects analysis), which implements ASCA+ and APCA+, enhanced versions of ASCA and APCA based on several principles from the theory of general linear models (GLM). The methodology is reviewed, the package structure and functions are presented, and a metabolomics data set is used to demonstrate the potential of ASCA+ and APCA+ to highlight true biomarkers corresponding to effects of interest in unbalanced designs. This article presents the R package limpca, which implements the ASCA+ and APCA+ methods. These two methods extend ASCA and APCA to unbalanced designs using several principles from the theory of general linear models. The package structure is presented and applied to metabolomics data to highlight biomarkers corresponding to effects of interest in an experimental design.
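note The general idea summarized in the description (estimate each effect with a linear model, then run a PCA on the resulting effect matrix) can be illustrated with a short sketch. This is not the limpca API and not the authors' implementation: it is written in Python with NumPy rather than R, it handles a single factor with at least two levels, and the names sum_coded_design and effect_pca, as well as the simulated data, are hypothetical. It assumes sum (deviation) coding and ordinary least squares, so that the intercept is the unweighted mean of the level means even when group sizes differ.

```python
import numpy as np


def sum_coded_design(levels):
    """Build an intercept plus sum-coded columns for one factor (hypothetical helper).

    Assumes at least two distinct levels; the last level (in sorted order)
    is coded as -1 in every effect column.
    """
    labels = sorted(set(levels))
    n, k = len(levels), len(labels)
    X = np.zeros((n, k))
    X[:, 0] = 1.0  # intercept column
    for i, lev in enumerate(levels):
        j = labels.index(lev)
        if j < k - 1:
            X[i, 1 + j] = 1.0
        else:
            X[i, 1:] = -1.0
    return X


def effect_pca(Y, levels):
    """Estimate the factor effect matrix by least squares, then decompose it by SVD."""
    X = sum_coded_design(levels)
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # one coefficient column per response
    M_effect = X[:, 1:] @ B[1:, :]             # effect matrix (intercept excluded)
    residuals = Y - X @ B
    U, s, Vt = np.linalg.svd(M_effect, full_matrices=False)
    scores = U * s                             # PCA scores of the effect matrix
    loadings = Vt.T                            # PCA loadings (one row per response)
    return M_effect, residuals, scores, loadings


# Simulated, unbalanced example: 2 samples at level "A", 4 at level "B", 8 responses.
rng = np.random.default_rng(0)
levels = ["A", "A", "B", "B", "B", "B"]
Y = rng.normal(size=(6, 8))
M_effect, residuals, scores, loadings = effect_pca(Y, levels)
```

In this one-factor sketch every sample of a level receives the same row in M_effect, equal to its level mean minus the unweighted average of the level means. The one-factor case only shows the mechanics; the description's point about unbalanced designs mainly concerns models with several factors and interactions.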
doi_str_mv 10.1002/cem.3482
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2833744237</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2833744237</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2932-7618323f5f50398a45896ff695d64b1d7adbef8b3a304f46518295b6d52fea9a3</originalsourceid><addsrcrecordid>eNp1kMtKxDAUhoMoOI6CjxBw46YzaZK2ibtSvMGI4gXchbRJ2oxtMyYdZHY-gs_ok9hx3Lo6P5zvPxw-AE5jNIsRwvNKdzNCGd4DkxhxHsWYve6DCWIsjThh5BAchbBEaNwROgHr1narSl7AvIePcCWrN1lraJyHQ6Nha3stPeyc0mOsoTOwsXXz_fmlbKf7YF0vW6h0sHWvFVRykLCUYYyuh_lTkc_zhyKHRna23WzbnR4ap8IxODCyDfrkb07By9Xlc3ETLe6vb4t8EVWYExxlacwIJiYxCSKcSZownhqT8kSltIxVJlWpDSuJJIgamiYxwzwpU5VgoyWXZArOdndX3r2vdRjE0q39-HMQmBGSUYpJNlLnO6ryLgSvjVh520m_ETESW6lilCq2Ukc02qEfttWbfzlRXN798j-2BnfE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2833744237</pqid></control><display><type>article</type><title>limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of methods</title><source>Wiley Online Library Journals Frontfile Complete</source><creator>Thiel, Michel ; Benaiche, Nadia ; Martin, Manon ; Franceschini, Sébastien ; Van Oirbeek, Robin ; Govaerts, Bernadette</creator><creatorcontrib>Thiel, Michel ; Benaiche, Nadia ; Martin, Manon ; Franceschini, Sébastien ; Van Oirbeek, Robin ; Govaerts, Bernadette</creatorcontrib><description>Summary Many modern analytical methods are used to analyze samples issued from an experimental design, for example, in medical, biological, chemical, or agronomic fields. Those methods generate most of the time, highly multivariate data like spectra or images, where the number of variables (descriptor responses) tends to be much larger than the number of experimental units. Therefore, multivariate statistical tools are necessary to identify variables that are consistently affected by experimental factors. 
In this context, two recent methods combining ANOVA and PCA, namely, ASCA (ANOVA‐Simultaneous Component Analysis) and APCA (ANOVA‐Principal Component Analysis), were developed. They provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they produce biased estimators of the factor effects when the design of experiment is unbalanced. This article presents the R package limpca (for linear models with principal component effects analysis) that implements ASCA+ and APCA+, an enhanced version of ASCA and APCA methods based on several principles from the theory of general linear models (GLM). In this paper, the methodology is reviewed, the package structure and functions are presented, and a metabolomics data set is used to clearly demonstrate the potential of ASCA+ and APCA+ methods to highlight true biomarkers corresponding to effects of interest in unbalanced designs. This article presents the R package limpca that implements ASCA+ and APCA+ methods. Those two methods extend the use of ASCA and APCA to unbalanced designs with several principles from the theory of general linear models. 
The package structure is presented and applied on metabolomics data to highlight biomarkers corresponding to effects of interest in an experimental design.</description><identifier>ISSN: 0886-9383</identifier><identifier>EISSN: 1099-128X</identifier><identifier>DOI: 10.1002/cem.3482</identifier><language>eng</language><publisher>Chichester: Wiley Subscription Services, Inc</publisher><subject>Analytical methods ; ANOVA ; APCA ; ASCA ; Biomarkers ; design of experiment ; Design of experiments ; Experimental design ; general linear model ; Multivariate analysis ; Principal components analysis ; Statistical models ; Variance analysis</subject><ispartof>Journal of chemometrics, 2023-07, Vol.37 (7), p.n/a</ispartof><rights>2023 John Wiley &amp; Sons, Ltd.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2932-7618323f5f50398a45896ff695d64b1d7adbef8b3a304f46518295b6d52fea9a3</citedby><cites>FETCH-LOGICAL-c2932-7618323f5f50398a45896ff695d64b1d7adbef8b3a304f46518295b6d52fea9a3</cites><orcidid>0000-0003-4800-0942 ; 0000-0002-0242-9623</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fcem.3482$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fcem.3482$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,777,781,1412,27905,27906,45555,45556</link.rule.ids></links><search><creatorcontrib>Thiel, Michel</creatorcontrib><creatorcontrib>Benaiche, Nadia</creatorcontrib><creatorcontrib>Martin, Manon</creatorcontrib><creatorcontrib>Franceschini, Sébastien</creatorcontrib><creatorcontrib>Van Oirbeek, Robin</creatorcontrib><creatorcontrib>Govaerts, Bernadette</creatorcontrib><title>limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family 
of methods</title><title>Journal of chemometrics</title><description>Summary Many modern analytical methods are used to analyze samples issued from an experimental design, for example, in medical, biological, chemical, or agronomic fields. Those methods generate most of the time, highly multivariate data like spectra or images, where the number of variables (descriptor responses) tends to be much larger than the number of experimental units. Therefore, multivariate statistical tools are necessary to identify variables that are consistently affected by experimental factors. In this context, two recent methods combining ANOVA and PCA, namely, ASCA (ANOVA‐Simultaneous Component Analysis) and APCA (ANOVA‐Principal Component Analysis), were developed. They provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they produce biased estimators of the factor effects when the design of experiment is unbalanced. This article presents the R package limpca (for linear models with principal component effects analysis) that implements ASCA+ and APCA+, an enhanced version of ASCA and APCA methods based on several principles from the theory of general linear models (GLM). In this paper, the methodology is reviewed, the package structure and functions are presented, and a metabolomics data set is used to clearly demonstrate the potential of ASCA+ and APCA+ methods to highlight true biomarkers corresponding to effects of interest in unbalanced designs. This article presents the R package limpca that implements ASCA+ and APCA+ methods. Those two methods extend the use of ASCA and APCA to unbalanced designs with several principles from the theory of general linear models. 
The package structure is presented and applied on metabolomics data to highlight biomarkers corresponding to effects of interest in an experimental design.</description><subject>Analytical methods</subject><subject>ANOVA</subject><subject>APCA</subject><subject>ASCA</subject><subject>Biomarkers</subject><subject>design of experiment</subject><subject>Design of experiments</subject><subject>Experimental design</subject><subject>general linear model</subject><subject>Multivariate analysis</subject><subject>Principal components analysis</subject><subject>Statistical models</subject><subject>Variance analysis</subject><issn>0886-9383</issn><issn>1099-128X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp1kMtKxDAUhoMoOI6CjxBw46YzaZK2ibtSvMGI4gXchbRJ2oxtMyYdZHY-gs_ok9hx3Lo6P5zvPxw-AE5jNIsRwvNKdzNCGd4DkxhxHsWYve6DCWIsjThh5BAchbBEaNwROgHr1narSl7AvIePcCWrN1lraJyHQ6Nha3stPeyc0mOsoTOwsXXz_fmlbKf7YF0vW6h0sHWvFVRykLCUYYyuh_lTkc_zhyKHRna23WzbnR4ap8IxODCyDfrkb07By9Xlc3ETLe6vb4t8EVWYExxlacwIJiYxCSKcSZownhqT8kSltIxVJlWpDSuJJIgamiYxwzwpU5VgoyWXZArOdndX3r2vdRjE0q39-HMQmBGSUYpJNlLnO6ryLgSvjVh520m_ETESW6lilCq2Ukc02qEfttWbfzlRXN798j-2BnfE</recordid><startdate>202307</startdate><enddate>202307</enddate><creator>Thiel, Michel</creator><creator>Benaiche, Nadia</creator><creator>Martin, Manon</creator><creator>Franceschini, Sébastien</creator><creator>Van Oirbeek, Robin</creator><creator>Govaerts, Bernadette</creator><general>Wiley Subscription Services, Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7U5</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-4800-0942</orcidid><orcidid>https://orcid.org/0000-0002-0242-9623</orcidid></search><sort><creationdate>202307</creationdate><title>limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of 
methods</title><author>Thiel, Michel ; Benaiche, Nadia ; Martin, Manon ; Franceschini, Sébastien ; Van Oirbeek, Robin ; Govaerts, Bernadette</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2932-7618323f5f50398a45896ff695d64b1d7adbef8b3a304f46518295b6d52fea9a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Analytical methods</topic><topic>ANOVA</topic><topic>APCA</topic><topic>ASCA</topic><topic>Biomarkers</topic><topic>design of experiment</topic><topic>Design of experiments</topic><topic>Experimental design</topic><topic>general linear model</topic><topic>Multivariate analysis</topic><topic>Principal components analysis</topic><topic>Statistical models</topic><topic>Variance analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Thiel, Michel</creatorcontrib><creatorcontrib>Benaiche, Nadia</creatorcontrib><creatorcontrib>Martin, Manon</creatorcontrib><creatorcontrib>Franceschini, Sébastien</creatorcontrib><creatorcontrib>Van Oirbeek, Robin</creatorcontrib><creatorcontrib>Govaerts, Bernadette</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of chemometrics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Thiel, Michel</au><au>Benaiche, Nadia</au><au>Martin, Manon</au><au>Franceschini, Sébastien</au><au>Van Oirbeek, Robin</au><au>Govaerts, 
Bernadette</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of methods</atitle><jtitle>Journal of chemometrics</jtitle><date>2023-07</date><risdate>2023</risdate><volume>37</volume><issue>7</issue><epage>n/a</epage><issn>0886-9383</issn><eissn>1099-128X</eissn><abstract>Summary Many modern analytical methods are used to analyze samples issued from an experimental design, for example, in medical, biological, chemical, or agronomic fields. Those methods generate most of the time, highly multivariate data like spectra or images, where the number of variables (descriptor responses) tends to be much larger than the number of experimental units. Therefore, multivariate statistical tools are necessary to identify variables that are consistently affected by experimental factors. In this context, two recent methods combining ANOVA and PCA, namely, ASCA (ANOVA‐Simultaneous Component Analysis) and APCA (ANOVA‐Principal Component Analysis), were developed. They provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they produce biased estimators of the factor effects when the design of experiment is unbalanced. This article presents the R package limpca (for linear models with principal component effects analysis) that implements ASCA+ and APCA+, an enhanced version of ASCA and APCA methods based on several principles from the theory of general linear models (GLM). In this paper, the methodology is reviewed, the package structure and functions are presented, and a metabolomics data set is used to clearly demonstrate the potential of ASCA+ and APCA+ methods to highlight true biomarkers corresponding to effects of interest in unbalanced designs. 
This article presents the R package limpca that implements ASCA+ and APCA+ methods. Those two methods extend the use of ASCA and APCA to unbalanced designs with several principles from the theory of general linear models. The package structure is presented and applied on metabolomics data to highlight biomarkers corresponding to effects of interest in an experimental design.</abstract><cop>Chichester</cop><pub>Wiley Subscription Services, Inc</pub><doi>10.1002/cem.3482</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0003-4800-0942</orcidid><orcidid>https://orcid.org/0000-0002-0242-9623</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0886-9383
ispartof Journal of chemometrics, 2023-07, Vol.37 (7), p.n/a
issn 0886-9383
1099-128X
language eng
recordid cdi_proquest_journals_2833744237
source Wiley Online Library Journals Frontfile Complete
subjects Analytical methods
ANOVA
APCA
ASCA
Biomarkers
design of experiment
Design of experiments
Experimental design
general linear model
Multivariate analysis
Principal components analysis
Statistical models
Variance analysis
title limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of methods
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T14%3A42%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=limpca:%20An%20R%20package%20for%20the%20linear%20modeling%20of%20high%E2%80%90dimensional%20designed%20data%20based%20on%20ASCA/APCA%20family%20of%20methods&rft.jtitle=Journal%20of%20chemometrics&rft.au=Thiel,%20Michel&rft.date=2023-07&rft.volume=37&rft.issue=7&rft.epage=n/a&rft.issn=0886-9383&rft.eissn=1099-128X&rft_id=info:doi/10.1002/cem.3482&rft_dat=%3Cproquest_cross%3E2833744237%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2833744237&rft_id=info:pmid/&rfr_iscdi=true