Comparison of zero replacement strategies for compositional data with large numbers of zeros
Published in: | Chemometrics and intelligent laboratory systems 2021-03, Vol.210, p.104248, Article 104248 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Modern applications in chemometrics and bioinformatics produce compositional data sets with a high proportion of zeros. One example is microbiome data, where zeros correspond to measurements below the detection limit of one count. When building statistical models, it is important that zeros are replaced by sensible values. Different replacement techniques from compositional data analysis are considered and compared in a simulation study and in examples. The comparison also includes a recently proposed method (Templ, 2020) [1] based on deep learning. Detailed insights into the appropriateness of the methods for a problem at hand are provided, and differences in the outcomes of statistical results are discussed.
• Analyzing data with high proportions of zeros.
• Comparing zero replacement methods for compositional data.
• Including a novel method based on deep learning.
• Regression analysis with a sparse compositional method.
• Application on microbiome data sets. |
---|---|
ISSN: | 0169-7439 1873-3239 |
DOI: | 10.1016/j.chemolab.2021.104248 |
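
The summary notes that zeros must be replaced by sensible values before modeling. The reason is that compositional methods work on log-ratios, which are undefined at zero. The following is a minimal sketch of one simple and widely used strategy, multiplicative replacement. It is not taken from the record and is not necessarily one of the methods compared in the article. The function names, the example counts, and the replacement value `delta` (here 65% of the detection limit of one count) are illustrative assumptions.

```python
import numpy as np


def multiplicative_replacement(x, delta):
    """Replace zeros in a single composition by `delta`.

    Non-zero parts are shrunk by a common factor so that the total is
    unchanged and the ratios among non-zero parts are preserved.
    """
    x = np.asarray(x, dtype=float)
    total = x.sum()
    zeros = x == 0
    # Mass given to the zeros is taken proportionally from the non-zero parts.
    shrink = 1.0 - zeros.sum() * delta / total
    return np.where(zeros, delta, x * shrink)


def clr(x):
    """Centered log-ratio transform; requires strictly positive parts."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()


# Hypothetical count vector for one sample (two parts below detection).
counts = np.array([12.0, 0.0, 30.0, 0.0, 58.0])
delta = 0.65  # assumed: 65% of a detection limit of one count

replaced = multiplicative_replacement(counts, delta)
coords = clr(replaced)  # now well defined, since all parts are positive
```

With many zeros per sample, the choice of `delta` affects a large share of the parts, so it can noticeably influence the log-ratio coordinates. This sensitivity is one reason to compare replacement strategies on the data at hand.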