An optimal normalization method for high sparse compositional microbiome data

In many omics data, including microbiome sequencing data, we are only able to measure relative information. Various computational or statistical methods have been proposed to extract absolute (or biologically relevant) information from this relative information; however, these methods are under rath...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:PLoS computational biology 2024-08, Vol.20 (8), p.e1012338
Hauptverfasser: Sohn, Michael B, Monaco, Cynthia, Gill, Steven R
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In many omics data, including microbiome sequencing data, we are only able to measure relative information. Various computational or statistical methods have been proposed to extract absolute (or biologically relevant) information from this relative information; however, these methods are under rather strong assumptions that may not be suitable for multigroup (more than two groups) and/or longitudinal outcome data. In this article, we first introduce the minimal assumption required to extract absolute from relative information. This assumption is less stringent than those imposed in existing methods, thus being applicable to multigroup and/or longitudinal outcome data. We then propose the first normalization method that works under this minimal assumption. The optimality and validity of the proposed method and its beneficial effects on downstream analysis are demonstrated in extensive simulation studies, where existing methods fail to produce consistent performance under the minimal assumption. We also demonstrate its application to real microbiome datasets to determine biologically relevant microbes to a specific disease/condition.
ISSN:1553-7358
1553-734X
1553-7358
DOI:10.1371/journal.pcbi.1012338