Prediction of bacterial associations with plants using a supervised machine-learning approach

Bibliographic details
Published in: Environmental microbiology 2016-12, Vol. 18 (12), p. 4847-4861
Main authors: Martínez-García, Pedro Manuel; López-Solanilla, Emilia; Ramos, Cayo; Rodríguez-Palenzuela, Pablo
Format: Article
Language: English
Summary: Recent scenarios of fresh produce contamination by human enteric pathogens have resulted in severe food‐borne outbreaks, and a new paradigm has emerged stating that some human‐associated bacteria can use plants as secondary hosts. As a consequence, there has been growing concern in the scientific community about these interactions that have not yet been elucidated. Since this is a relatively new area, there is a lack of strategies to address the problem of food‐borne illnesses due to the ingestion of fruits and vegetables. In the present study, we performed specific genome annotations to train a supervised machine‐learning model that allows for the identification of plant‐associated bacteria with a precision of ∼93%. The application of our method to approximately 9500 genomes predicted several unknown interactions between well‐known human pathogens and plants, and it also confirmed several cases for which evidence has been reported. We observed that factors involved in adhesion, the deconstruction of the plant cell wall and detoxifying activities were highlighted as the most predictive features. The application of our strategy to sequenced strains that are involved in food poisoning can be used as a primary screening tool to determine the possible causes of contaminations.
ISSN: 1462-2912, 1462-2920
DOI: 10.1111/1462-2920.13389