Scalable log-ratio lasso regression for enhanced microbial feature selection with FLORAL

Bibliographic details
Published in: Cell reports methods 2024-11, Vol.4 (11), p.100899, Article 100899
Main authors: Fei, Teng; Funnell, Tyler; Waters, Nicholas R.; Raj, Sandeep S.; Baichoo, Mirae; Sadeghi, Keimya; Dai, Anqi; Miltiadous, Oriana; Shouval, Roni; Lv, Meng; Peled, Jonathan U.; Ponce, Doris M.; Perales, Miguel-Angel; Gönen, Mithat; van den Brink, Marcel R.M.
Format: Article
Language: English
Description
Summary: Identifying predictive biomarkers of patient outcomes from high-throughput microbiome data is of high interest, while existing computational methods do not satisfactorily account for complex survival endpoints, longitudinal samples, and taxa-specific sequencing biases. We present FLORAL, an open-source tool to perform scalable log-ratio lasso regression and microbial feature selection for continuous, binary, time-to-event, and competing risk outcomes, with compatibility for longitudinal microbiome data as time-dependent covariates. The proposed method adapts the augmented Lagrangian algorithm for a zero-sum constraint optimization problem while enabling a two-stage screening process for enhanced false-positive control. In extensive simulation and real-data analyses, FLORAL achieved consistently better false-positive control compared to other lasso-based approaches and better sensitivity over popular differential abundance testing methods for datasets with smaller sample sizes. In a survival analysis of allogeneic hematopoietic cell transplant recipients, FLORAL demonstrated considerable improvement in microbial feature selection by utilizing longitudinal microbiome data over solely using baseline microbiome data.

• FLORAL correlates microbial features with continuous, binary, or survival outcomes
• FLORAL utilizes longitudinal data to improve feature selection in survival models
• False discoveries are controlled by FLORAL’s two-step selection procedure
• FLORAL identifies meaningful microbial markers in allo-HCTs

Microbial biomarker identification has become a key application for variable selection methods, yet real-world studies present new challenges in linking high-dimensional longitudinal microbiome data with complex time-to-event outcomes. To our knowledge, these challenges have not been sufficiently addressed in the literature. The nature of survival endpoints complicates the definition of patient groups, which is necessary for direct comparisons of longitudinal trajectories via differential abundance testing methods. Additionally, existing log-ratio lasso regression methods have not been systematically extended to Cox and Fine-Gray models, particularly with respect to incorporating longitudinal microbial features.

Fei et al. develop a computational tool, FLORAL, that correlates high-dimensional microbial features with continuous, binary, or survival outcomes. FLORAL incorporates longitudinal microbiome data in survival regression models,
ISSN:2667-2375
DOI:10.1016/j.crmeth.2024.100899
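
The summary above describes a lasso regression on log-transformed abundances under a zero-sum constraint on the coefficients, solved with an augmented Lagrangian scheme. The following Python sketch illustrates that general idea for a continuous outcome only. It is not FLORAL's implementation or API: the function names, the proximal-gradient inner solver, the pseudocount, and all default values are assumptions made for illustration, and the two-stage screening step, cross-validation, and the survival models are left out.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def zero_sum_lasso(counts, y, lam, mu=1.0, n_outer=50, n_inner=200, pseudo=1.0):
    """Hypothetical helper: minimize (1/2n)||y - X b||^2 + lam * ||b||_1
    subject to sum(b) = 0, where X = log(counts + pseudo)."""
    X = np.log(counts + pseudo)
    X = X - X.mean(axis=0)      # column-centering stands in for an intercept
    yc = y - y.mean()
    n, p = X.shape
    # Upper bound on the Lipschitz constant of the smooth part:
    # (1/2n)||yc - X b||^2 + (mu/2) * (sum(b) + u)^2
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n + mu * p)
    beta = np.zeros(p)
    u = 0.0                     # scaled dual variable for the constraint
    for _ in range(n_outer):
        for _ in range(n_inner):
            # Gradient of the smooth part; the penalty term adds the same
            # scalar to every coordinate.
            grad = -X.T @ (yc - X @ beta) / n + mu * (beta.sum() + u)
            beta = soft_threshold(beta - step * grad, step * lam)
        u += beta.sum()         # dual ascent on the zero-sum constraint
    return beta

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n, p = 200, 10
counts = rng.poisson(lam=np.exp(rng.normal(3.0, 1.0, size=(n, p))))
beta_true = np.array([1.0, -0.6, -0.4] + [0.0] * (p - 3))   # sums to zero
y = np.log(counts + 1.0) @ beta_true + 0.1 * rng.normal(size=n)

beta_hat = zero_sum_lasso(counts, y, lam=0.05)
print(np.round(beta_hat, 3), "sum =", beta_hat.sum())
```

Because the fitted coefficients sum to zero, the linear predictor can be rewritten as a combination of log-ratios between taxa, so it does not change when all abundances in a sample are multiplied by a common factor such as sequencing depth.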