Predicting quorum sensing peptides using stacked generalization ensemble with gradient boosting based feature selection
Published in: | The journal of microbiology 2022, 60(7), pp. 756-765 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Bacteria exist in natural environments for most of their life as complex, heterogeneous, multicellular aggregates. Under these circumstances, critical cell functions are controlled by several signaling molecules known as quorum sensing (QS) molecules. In Gram-positive bacteria, peptides are deployed as QS molecules. The development of antibodies against such QS molecules has been identified as a promising therapeutic intervention for bacterial control. Hence, the identification of QS peptides has received considerable attention. A fast and reliable predictive model that identifies QS peptides can complement existing high-throughput experiments. In this study, a stacked generalization ensemble model with Gradient Boosting Machine (GBM)-based feature selection, named EnsembleQS, was developed to predict QS peptides with high accuracy. On the selected GBM features (791D), EnsembleQS outperformed finely tuned baseline classifiers and demonstrated robust performance. The accuracy of EnsembleQS is 4% higher than that of the ensemble model on the hybrid dataset. On an independent data set of 40 QS peptides, EnsembleQS achieved an accuracy of 93.4%, with a Matthews correlation coefficient (MCC) of 0.91 and an area under the ROC curve (AUC) of 0.951. These results suggest that EnsembleQS will be a useful computational framework for predicting QS peptides and will efficiently support proteomics research. The source code and all datasets used in this study are publicly available at https://github.com/proteinexplorers/EnsembleQS. |
ISSN: | 1225-8873, 1976-3794 |
DOI: | 10.1007/s12275-022-2044-9 |
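
Note on the reported metrics: the summary quotes accuracy and the Matthews correlation coefficient (MCC) for a binary QS / non-QS classifier. The short sketch below shows how these two figures are conventionally derived from confusion-matrix counts. It is not taken from the EnsembleQS repository; the function names and the example counts are hypothetical and are not the paper's results.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when a marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts, for illustration only.
    tp, tn, fp, fn = 18, 19, 1, 2
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are of unequal size.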