A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates

Bibliographic details
Published in: mBio 2020-08, Vol.11 (4)
Main authors: Pincus, Nathan B, Ozer, Egon A, Allen, Jonathan P, Nguyen, Marcus, Davis, James J, Winter, Deborah R, Chuang, Chih-Hsien, Chiu, Cheng-Hsun, Zamorano, Laura, Oliver, Antonio, Hauser, Alan R
Format: Article
Language: English
Description
Summary: Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium's ability to cause disease. We therefore asked whether it was possible to predict the virulence of isolates based on their genomic content. We applied a machine learning approach to a genetically and phenotypically diverse collection of 115 clinical isolates using genomic information and corresponding virulence phenotypes in a mouse model of bacteremia. We defined the accessory genome of these isolates through the presence or absence of accessory genomic elements (AGEs), sequences present in some strains but not others. Machine learning models trained using AGEs were predictive of virulence, with a mean nested cross-validation accuracy of 75% using the random forest algorithm. However, individual AGEs did not have a large influence on the algorithm's performance, suggesting instead that virulence predictions are derived from a diffuse genomic signature. These results were validated with an independent test set of 25 isolates whose virulence was predicted with 72% accuracy. Machine learning models trained using core genome single-nucleotide variants and whole-genome k-mers also predicted virulence. Our findings are a proof of concept for the use of bacterial genomes to predict pathogenicity in P. aeruginosa and highlight the potential of this approach for predicting patient outcomes.
P. aeruginosa is a clinically important Gram-negative opportunistic pathogen. P. aeruginosa shows a large degree of genomic heterogeneity both through variation in sequences found throughout the species (core genome) and through the presence or absence of sequences in different isolates (accessory genome). P. aeruginosa isolates also differ markedly in their ability to cause disease. In this study, we used machine learning to predict the virulence level of isolates in a mouse bacteremia model based on genomic content. We show that both the accessory and core genomes are predictive of virulence. This study provides a machine learning framework to investigate relationships between bacterial genomes and complex phenotypes such as virulence.
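The summary describes the accessory genome as the presence or absence of elements found in some strains but not others. As a minimal sketch of that encoding only (not the authors' pipeline), the Python below turns per-isolate element sets into a binary presence/absence matrix of the kind a classifier could be trained on. The function name `accessory_matrix`, the isolate names, and the element IDs are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def accessory_matrix(
    isolates: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build a binary presence/absence matrix of accessory elements.

    Assumption for this sketch: an element is "accessory" if at least one
    isolate lacks it; elements carried by every isolate are treated as core
    and dropped.
    """
    names = sorted(isolates)
    all_elements = set().union(*isolates.values())
    # Keep only elements that are missing from at least one isolate.
    accessory = sorted(
        e for e in all_elements
        if any(e not in isolates[n] for n in names)
    )
    # One row per isolate, one column per accessory element (1 = present).
    matrix = [[1 if e in isolates[n] else 0 for e in accessory] for n in names]
    return names, accessory, matrix


# Hypothetical toy data: isolate names and element IDs are invented.
toy = {
    "isolate_A": {"core_1", "age_1", "age_2"},
    "isolate_B": {"core_1", "age_2"},
    "isolate_C": {"core_1", "age_3"},
}

names, elements, matrix = accessory_matrix(toy)
for name, row in zip(names, matrix):
    print(name, row)
```

In this toy input, `core_1` is shared by every isolate and is therefore excluded, leaving three accessory columns. Each row could then serve as a feature vector paired with a virulence label for model training.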
ISSN: 2161-2129, 2150-7511
DOI: 10.1128/mBio.01527-20