Prediction and interpretation of antibiotic-resistance genes occurrence at recreational beaches using machine learning models

Bibliographic details
Published in: Journal of Environmental Management, 2023-02, Vol. 328, Article 116969
Main authors: Iftikhar, Sara; Karim, Asad Mustafa; Karim, Aoun Murtaza; Karim, Mujahid Aizaz; Aslam, Muhammad; Rubab, Fazila; Malik, Sumera Kausar; Kwon, Jeong Eun; Hussain, Imran; Azhar, Esam I.; Kang, Se Chan; Yasir, Muhammad
Format: Article
Language: English
Description
Summary: Antibiotic-resistant bacteria and antibiotic resistance genes (ARGs) are pollutants of worldwide concern that seriously threaten public health and ecosystems. Machine learning (ML) prediction models have been applied to predict ARGs in beach waters. However, the existing studies were conducted at a single location and had low prediction performance. Moreover, ML models are “black boxes” that do not reveal the internal nuances and mechanisms behind their predictions. This lack of transparency and trust can have serious consequences when these models are used in high-stakes decisions. In this study, we developed a gradient boosted regression tree (GBRT) based ML model and then described its behavior using four explainable artificial intelligence (XAI) model-agnostic explanation methods. We used hydro-meteorological and qPCR data from beaches in South Korea and Pakistan and developed ML prediction models for aac(6′)-Ib-cr, sul1, and tetX with 10-fold time-blocked cross-validation performances of 4.9, 2.06, and 4.4 root mean squared logarithmic error, respectively. We then analyzed the local and global behavior of the developed ML model using four interpretation methods. The developed ML models showed that water temperature, precipitation, and tide are the most important predictors of ARGs at recreational beaches. We show that the model-agnostic interpretation methods not only explain the behavior of the ML model but also provide insights into its behavior under new, unseen conditions. Moreover, these post-processing techniques can serve as a debugging tool for ML-based modeling.
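The summary above reports model quality as root mean squared logarithmic error (RMSLE) under 10-fold time-blocked cross-validation. The record does not say how the authors computed either, so the following is only a minimal sketch under stated assumptions. It uses the common RMSLE definition based on log(1 + x). It also assumes that "time-blocked" means each test fold is one contiguous block of time-ordered samples. The names `rmsle` and `time_blocked_folds` are hypothetical and not taken from the article.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values.

    Uses log(1 + x), the common convention; the article's exact
    formula is not given in this record (assumption).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def time_blocked_folds(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) pairs for time-ordered samples.

    Each test fold is one contiguous block, so neighbouring
    (autocorrelated) samples are held out together. This is one
    plausible reading of "time-blocked", not the authors' confirmed scheme.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for k in range(n_folds):
        # Spread the remainder over the first `extra` folds.
        stop = start + base + (1 if k < extra else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop
```

In use, a model would be fitted on `train_idx` and scored with `rmsle` on `test_idx` for each fold, and the fold scores averaged. Whether the study averaged scores this way is not stated in the record.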
Highlights:
• ARGs were detected by qPCR at recreational beaches.
• ML models were developed with high prediction performance to predict ARGs.
• Explainable Artificial Intelligence explained black-box models' behavior for ARGs.
• Water temperature, precipitation, and tide greatly affected ARGs abundance.
• Presented post-processing techniques can be a debugging tool for ML modeling.
ISSN: 0301-4797, 1095-8630
DOI: 10.1016/j.jenvman.2022.116969