limpca: An R package for the linear modeling of high‐dimensional designed data based on ASCA/APCA family of methods

Bibliographic details
Published in: Journal of Chemometrics, 2023-07, Vol. 37 (7), p. n/a
Main authors: Thiel, Michel; Benaiche, Nadia; Martin, Manon; Franceschini, Sébastien; Van Oirbeek, Robin; Govaerts, Bernadette
Format: Article
Language: English
Description
Summary: Many modern analytical methods are used to analyze samples obtained from an experimental design, for example, in medical, biological, chemical, or agronomic fields. Most of the time, those methods generate highly multivariate data such as spectra or images, where the number of variables (descriptor responses) tends to be much larger than the number of experimental units. Therefore, multivariate statistical tools are necessary to identify variables that are consistently affected by experimental factors. In this context, two recent methods combining ANOVA and PCA were developed: ASCA (ANOVA‐Simultaneous Component Analysis) and APCA (ANOVA‐Principal Component Analysis). They provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they produce biased estimators of the factor effects when the experimental design is unbalanced. This article presents the R package limpca (for linear models with principal component effects analysis), which implements ASCA+ and APCA+, enhanced versions of the ASCA and APCA methods based on several principles from the theory of general linear models (GLM). In this paper, the methodology is reviewed, the package structure and functions are presented, and a metabolomics data set is used to demonstrate the potential of the ASCA+ and APCA+ methods to highlight true biomarkers corresponding to effects of interest in unbalanced designs.

This article presents the R package limpca that implements the ASCA+ and APCA+ methods. Those two methods extend the use of ASCA and APCA to unbalanced designs with several principles from the theory of general linear models. The package structure is presented and applied to metabolomics data to highlight biomarkers corresponding to effects of interest in an experimental design.
ISSN: 0886-9383, 1099-128X
DOI: 10.1002/cem.3482
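
The summary notes that classical ASCA/APCA effect estimates become biased when the design is unbalanced. The sketch below illustrates that point on a small scale. It is not the limpca API and not the authors' implementation. It uses a single response variable rather than a multivariate one. The data, the cell counts, and the function names are hypothetical. It also assumes sum (deviation) coding and a saturated two-factor model, neither of which is stated in the record. Under those assumptions, the least-squares estimate of a main effect reduces to an unweighted average of cell means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical noise-free data: y = MU + A_EFF[i] + B_EFF[j].
# The cell counts are deliberately unbalanced.
MU, A_EFF, B_EFF = 10.0, (1.0, -1.0), (2.0, -2.0)
CELL_COUNTS = {(0, 0): 4, (0, 1): 1, (1, 0): 1, (1, 1): 4}

observations = [
    (i, j, MU + A_EFF[i] + B_EFF[j])
    for (i, j), n in CELL_COUNTS.items()
    for _ in range(n)
]


def level_mean_effects(obs):
    """Classical estimate for factor A: level mean minus grand mean."""
    grand = mean([y for _, _, y in obs])
    by_level = defaultdict(list)
    for i, _, y in obs:
        by_level[i].append(y)
    return {i: mean(ys) - grand for i, ys in sorted(by_level.items())}


def sum_coded_effects(obs):
    """Least-squares estimate for factor A under sum coding (saturated model).

    With every cell observed, this equals the unweighted average of the
    cell means of a level minus the average of all cell means.
    """
    by_cell = defaultdict(list)
    for i, j, y in obs:
        by_cell[(i, j)].append(y)
    cell_means = {cell: mean(ys) for cell, ys in by_cell.items()}
    intercept = mean(list(cell_means.values()))
    levels = sorted({i for i, _ in cell_means})
    return {
        i: mean([m for (a, _), m in cell_means.items() if a == i]) - intercept
        for i in levels
    }


naive = level_mean_effects(observations)
glm = sum_coded_effects(observations)
print("level-mean estimate of A:  ", naive)
print("sum-coded GLM estimate of A:", glm)
```

In this toy design the level-mean estimate of the first level of A absorbs part of the effect of B, because most observations at that level also share one level of B. The sum-coded estimate recovers the value used to generate the data.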