Multi-part strategy for testing differential taxa abundance in sequencing data: A simulation study with an application to a microbiome study
Published in: | Journal of microbiological methods 2023-09, Vol.212, p.106810-106810, Article 106810 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Comparing the microbiome across study arms is a recurrent goal in many studies. Standard statistical methods are often used for this purpose; however, they are not always the best choice given the characteristics of microbiota sequencing data: non-negative, highly skewed counts with a large number of zeros.
A multi-part strategy that combines a two-part test (as described by Wagner et al., 2011), a Wilcoxon rank-sum test, a chi-square test, and Barnard's test was explored to compare taxa abundance between study arms. The test is chosen according to the data structure. The type I error of the multi-part strategy was evaluated in a simulation study, and the method was applied to real data. A script that performs the analysis with the multi-part approach is provided for the statistical software SAS.
Several scenarios were simulated, and the type I error was not inflated in any of them. Depending on whether the two-part test (as described by Wagner et al., 2011) or the multi-part strategy (as proposed in this article) is used, the statistically significant differences found can differ, so different biological implications can be drawn from the same comparison in the same data set.
When comparing taxa abundance between study arms, careful attention must be paid to the data structure in order to choose an appropriate analysis method. Our approach selects the most suitable test according to the type of data observed, keeps the type I error from being inflated, and is easy to apply with the SAS macro provided.
• A multi-part strategy is proposed to compare the taxa abundance between study arms.
• The type I error of the multi-part strategy in the simulated scenarios was not inflated.
• The method was applied to real data.
• When comparing taxa abundance between arms, the data structure should be considered.
• The SAS program to perform the analysis with the multi-part approach is provided. |
---|---|
ISSN: | 0167-7012 1872-8359 |
DOI: | 10.1016/j.mimet.2023.106810 |