Deep in the Bowel: Highly Interpretable Neural Encoder-Decoder Networks Predict Gut Metabolites from Gut Microbiome

Bibliographic details
Published in:	BMC genomics 2020-07, Vol.21 (Suppl 4), p.256-256, Article 256
Main authors:	Le, Vuong; Quinn, Thomas P; Tran, Truyen; Venkatesh, Svetha
Format:	Article
Language:	English
Subjects:	Accuracy; Bacteria; Biomarkers; Coders; Data analysis; Data processing; Datasets; Deep learning; Digestive system; Ecosystem biology; Encoders-Decoders; Gastrointestinal Microbiome; Gene expression; Genetic aspects; Genomics; Graph theory; Humans; Inflammation; Inflammatory bowel disease; Inflammatory bowel diseases; Inflammatory Bowel Diseases - metabolism; Inflammatory Bowel Diseases - microbiology; Interpretability; Intestinal microflora; Intestine; Liquid chromatography; Machine learning; Mass spectrometry; Mass spectroscopy; Metabolism; Metabolites; Metabolome; Metabolomics; Microbiomes; Microbiota; Microbiota (Symbiotic organisms); Microorganisms; Models, Statistical; Multi-omics; Neural coding; Neural networks; Neural Networks, Computer; Next-generation sequencing; Physiological aspects; Relative abundance; Workflow
Online access:	Full text

Description
Abstract:	Technological advances in next-generation sequencing (NGS) and chromatographic assays [e.g., liquid chromatography mass spectrometry (LC-MS)] have made it possible to identify thousands of microbe and metabolite species, and to measure their relative abundance. In this paper, we propose a sparse neural encoder-decoder network to predict metabolite abundances from microbe abundances. Using paired data from a cohort of inflammatory bowel disease (IBD) patients, we show that our neural encoder-decoder model outperforms linear univariate and multivariate methods in terms of accuracy, sparsity, and stability. Importantly, we show that our neural encoder-decoder model is not simply a black box designed to maximize predictive accuracy. Rather, the network's hidden layer (i.e., the latent space, comprised only of sparsely weighted microbe counts) actually captures key microbe-metabolite relationships that are themselves clinically meaningful. Although this hidden layer is learned without any knowledge of the patient's diagnosis, we show that the learned latent features are structured in a way that predicts IBD and treatment status with high accuracy. By imposing a non-negative weights constraint, the network becomes a directed graph where each downstream node is interpretable as the additive combination of the upstream nodes. Here, the middle layer comprises distinct microbe-metabolite axes that relate key microbial biomarkers with metabolite biomarkers. By pre-processing the microbiome and metabolome data using compositional data analysis methods, we ensure that our proposed multi-omics workflow will generalize to any pair of -omics data. To the best of our knowledge, this work is the first application of neural encoder-decoders for the interpretable integration of multi-omics biological data.
ISSN:	1471-2164
DOI:	10.1186/s12864-020-6652-7
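
Illustrative note: the abstract describes two ideas that a short sketch may make more concrete, namely compositional pre-processing of abundance data and a network whose weights are constrained to be non-negative so that each downstream node is an additive combination of upstream nodes. The Python below is a minimal sketch under stated assumptions, not the authors' implementation: the centered log-ratio (CLR) transform is assumed as one common compositional method (the abstract does not name the transform), and the function names `clr`, `nonneg_linear`, and `predict`, the pseudocount, and the absence of an activation function are all hypothetical choices made for illustration.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample.

    Assumption: CLR stands in for the unspecified compositional method;
    the pseudocount is a hypothetical way to avoid log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def nonneg_linear(x, weights):
    """Additive layer whose weights are clamped to be >= 0 before use.

    Each output node is a non-negative weighted sum of the input nodes,
    so a zero weight means "no edge" in the resulting directed graph.
    """
    return [
        sum(max(w, 0.0) * xi for w, xi in zip(row, x))
        for row in weights
    ]


def predict(microbe_counts, w_enc, w_dec):
    """Encode microbes into a latent layer, then decode to metabolites."""
    latent = nonneg_linear(clr(microbe_counts), w_enc)
    metabolites = nonneg_linear(latent, w_dec)
    return latent, metabolites


# Hypothetical toy example: 3 microbes -> 2 latent nodes -> 1 metabolite.
w_enc = [[0.5, 0.0, 0.2], [0.0, 1.0, 0.0]]
w_dec = [[1.0, 0.0]]
latent, metabolites = predict([10, 0, 5], w_enc, w_dec)
```

In this sketch, sparsity would correspond to most entries of `w_enc` and `w_dec` being zero; how the published model learns and enforces sparse, non-negative weights is not specified in this record.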