Prediction of genome-wide imipenem resistance features in Klebsiella pneumoniae using machine learning

Bibliographic details
Published in: Journal of medical microbiology 2023-02, Vol.72 (2)
Main authors: Li, Shanshan, Wu, Jun, Ma, Nan, Liu, Wenjia, Shao, Mengjie, Ying, Nanjiao, Zhu, Lei
Format: Article
Language: English
Description
Summary: The resistance rate of Klebsiella pneumoniae (K. pneumoniae) to imipenem is increasing year by year, and the imipenem resistance mechanism of K. pneumoniae is complex. New strategies are therefore urgently needed to explore the mechanism of imipenem resistance so that the drug can be used effectively and accurately in clinical practice. Machine learning can identify resistance features and biological processes that influence microbial resistance from whole-genome sequencing (WGS) data. This work aimed to predict imipenem resistance genetic features in K. pneumoniae from whole-genome k-mer features, and to analyse their function in order to understand the resistance mechanism. This study analysed WGS data of K. pneumoniae combined with imipenem resistance phenotypes, and established an imipenem genotype-phenotype model to predict resistance features using the chi-squared test and random forest. An external clinical dataset was used to verify the predictive power of the resistance features. Potential genes were identified by aligning the resistance features with the reference genome using blastn; the functions of these genes were further analysed with GO and KEGG analysis to explore resistance-related signalling pathways; and resistance sequence patterns were screened using the STREME software. Finally, the resistance features were combined and modelled with four machine-learning algorithms (logistic regression, SVM, GBDT and XGBoost) to evaluate their phenotype prediction ability. A total of 16 670 imipenem resistance features were predicted from the genotype-phenotype model. Thirty potential genes were identified by annotating the resistance features, and they corresponded to known antibiotic-related genes ( , , , etc.). GO and KEGG pathway analyses indicated a possible association of imipenem resistance with metabolic processes and the cell membrane. The patterns CRYCAGCDN and CGRDAAAN were found in the imipenem resistance features and were widely present in reported β-lactam resistance genes ( , , , etc.), and YCYAGCMCAST, associated with metabolic functions (organic substance metabolic process, nitrogen compound metabolic process and cellular metabolic process), was identified from the top 50 resistance features. Of the 25 resistance genes in the training dataset, 19 also appeared in the external dataset, which supports the accuracy of the prediction. The area under the curve (AUC) values of logistic regression, SVM, GBDT and XGBoost were 0.965, 0.966, 0.969 and 0.969, respectively, indicating that the imipenem resistance features have strong predictive power. Machine-learning methods
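
The summary refers to k-mer features and to sequence patterns such as CRYCAGCDN, which are written in the standard IUPAC degenerate nucleotide codes (for example, R stands for A or G, and N for any base). The following is a minimal Python sketch of how such k-mers could be counted and such patterns located in a sequence. It is not the authors' pipeline: the helper names `kmer_counts`, `iupac_to_regex` and `find_motif` are hypothetical, and the toy sequence is invented for illustration.

```python
import re
from collections import Counter

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def iupac_to_regex(motif):
    """Translate a degenerate IUPAC motif into a compiled regular expression."""
    parts = []
    for code in motif.upper():
        bases = IUPAC[code]
        # Single bases stay literal; ambiguous codes become character classes.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def find_motif(seq, motif):
    """Return 0-based start positions of all motif matches, overlaps included."""
    # A lookahead makes each match zero-width, so overlapping hits are found.
    pattern = re.compile("(?=(" + iupac_to_regex(motif).pattern + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Toy sequence, invented for illustration only (not data from the study).
toy = "TTCACCAGCATGGCGAAAAACC"
print(kmer_counts(toy, 3).most_common(3))
print(find_motif(toy, "CRYCAGCDN"))
print(find_motif(toy, "CGRDAAAN"))
```

In a genotype-phenotype setting, the per-genome k-mer counts would form the feature matrix on which a statistical filter and a classifier operate; the sketch stops short of that step because the record does not give the k value or model settings.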
ISSN: 0022-2615, 1473-5644
DOI: 10.1099/jmm.0.001657