SDA: a semi-parametric differential abundance analysis method for metabolomics and proteomics data

Bibliographic details
Published in:	BMC Bioinformatics, 2019-10, Vol. 20 (1), p. 501-10, Article 501
Main authors:	Li, Yuntong; Fan, Teresa W M; Lane, Andrew N; Kang, Woo-Young; Arnold, Susanne M; Stromberg, Arnold J; Wang, Chi; Chen, Li
Format:	Article
Language:	English
Subjects:	Analysis; Differential abundance analysis; Information management; Kernel smoothing; Mass spectrometry; Mass Spectrometry - methods; Metabolomics; Metabolomics - methods; Methodology; Models, Statistical; Proteomics; Proteomics - methods; Semi-parametric log-linear model; Software; Spectroscopy; Statistical methods
Online access:	Full text

Description
Abstract:	Identifying differentially abundant features between experimental groups is a common goal of many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and there is often a large fraction of zero values. Although several statistical methods have been proposed, they either assume that the data are normally distributed or are inefficient. We propose a new semi-parametric differential abundance analysis (SDA) method for metabolomics and proteomics data from MS. The method uses a two-part model to characterize the data from each feature: a logistic regression for the zero proportion and a semi-parametric log-linear model for the possibly non-normally distributed non-zero values. A kernel-smoothed likelihood method is developed to estimate the model coefficients, and a likelihood ratio test is constructed for differential abundance analysis. The method has been implemented in an R package, SDAMS, which is available at https://www.bioconductor.org/packages/release/bioc/html/SDAMS.html. By introducing the two-part semi-parametric model, SDA is able to handle both non-normally distributed data and a large fraction of zero values in an MS dataset. It also allows for adjustment of covariates. Simulations and real data analyses demonstrate that SDA outperforms existing methods.
ISSN:	1471-2105
DOI:	10.1186/s12859-019-3067-z
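
The two-part idea summarized in the abstract can be illustrated with a small sketch. This is not the SDA method and not the SDAMS API: the kernel-smoothed semi-parametric likelihood is replaced here by a plain Gaussian model on the log scale, there are only two groups and no covariates, and every function name is hypothetical. It only shows how a zero-proportion part and a non-zero part can be combined into one likelihood ratio test, assuming each group contains at least two distinct non-zero values.

```python
import math


def bernoulli_loglik(n_zero, n_total, p):
    """Log-likelihood of observing n_zero zeros among n_total values."""
    ll = 0.0
    if n_zero > 0:
        ll += n_zero * math.log(p)
    if n_total - n_zero > 0:
        ll += (n_total - n_zero) * math.log(1.0 - p)
    return ll


def gaussian_profile_loglik(rss, n):
    """Gaussian log-likelihood maximized over the variance, given the
    residual sum of squares rss of n observations."""
    sigma2 = rss / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def mean(values):
    return sum(values) / len(values)


def two_part_lrt(group_a, group_b):
    """Illustrative two-part likelihood ratio test for one feature.

    Part 1 compares the zero proportions of the two groups (a logistic
    regression on a single group indicator reduces to this).
    Part 2 compares the mean log abundance of the non-zero values,
    using a Gaussian stand-in for the semi-parametric model.
    Returns (statistic, p_value) with 2 degrees of freedom.
    """
    groups = (group_a, group_b)
    sizes = [len(g) for g in groups]
    zeros = [sum(1 for x in g if x == 0) for g in groups]

    # Part 1: group-specific versus pooled zero probability.
    ll_alt_zero = sum(
        bernoulli_loglik(z, n, z / n) for z, n in zip(zeros, sizes)
    )
    pooled_p = sum(zeros) / sum(sizes)
    ll_null_zero = bernoulli_loglik(sum(zeros), sum(sizes), pooled_p)

    # Part 2: group-specific versus pooled mean of the log non-zero values.
    logs = [[math.log(x) for x in g if x > 0] for g in groups]
    pooled = logs[0] + logs[1]
    rss_alt = sum(sum((v - mean(l)) ** 2 for v in l) for l in logs)
    rss_null = sum((v - mean(pooled)) ** 2 for v in pooled)
    ll_alt_pos = gaussian_profile_loglik(rss_alt, len(pooled))
    ll_null_pos = gaussian_profile_loglik(rss_null, len(pooled))

    stat = 2.0 * ((ll_alt_zero - ll_null_zero) + (ll_alt_pos - ll_null_pos))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    control = [0, 0, 0, 1.0, 1.2, 0.9, 1.1, 1.3]
    treated = [0, 10.0, 12.0, 9.0, 11.0, 13.0, 10.5, 9.5]
    print(two_part_lrt(control, treated))
```

The two parts are summed because, in a two-part model, the likelihood factorizes into a term for whether a value is zero and a term for its magnitude given that it is non-zero. The test therefore picks up a difference in either the zero proportion or the non-zero abundance.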