Development of an interpretable machine learning model associated with heavy metals’ exposure to identify coronary heart disease among US adults via SHAP: Findings of the US NHANES from 2003 to 2018

Bibliographic details
Published in: Chemosphere (Oxford), 2023-01, Vol. 311, p. 137039, Article 137039
Main authors: Li, Xi; Zhao, Yang; Zhang, Dongdong; Kuang, Lei; Huang, Hao; Chen, Weiling; Fu, Xueru; Wu, Yuying; Li, Tianze; Zhang, Jinli; Yuan, Lijun; Hu, Huifang; Liu, Yu; Zhang, Ming; Hu, Fulan; Sun, Xizhuo; Hu, Dongsheng
Format: Article
Language: English
Description
Summary: Limited information is available on the links between heavy metals' exposure and coronary heart disease (CHD). We aim to establish an efficient and explainable machine learning (ML) model that associates heavy metals' exposure with CHD identification. Our datasets for investigating the associations between heavy metals and CHD were sourced from the US National Health and Nutrition Examination Survey (US NHANES, 2003–2018). Five ML models were established to identify CHD by heavy metals' exposure. Further, 11 discrimination characteristics were used to test the strength of the models. The optimally performing model was selected for identification. Finally, the SHapley Additive exPlanations (SHAP) tool was used for interpreting the features to visualize the selected model's decision-making capacity. In total, 12,554 participants were eligible for this study. The best performing random forest classifier (RF) based on 13 heavy metals to identify CHD was chosen (AUC: 0.827; 95% CI: 0.777–0.877; accuracy: 95.9%). SHAP values indicated that cesium (1.62), thallium (1.17), antimony (1.63), dimethylarsonic acid (0.91), barium (0.76), arsenous acid (0.79), total arsenic (0.01) in urine, and lead (3.58) and cadmium (4.66) in blood positively contributed to the model, while cobalt (−0.15), cadmium (−2.93), and uranium (−0.13) in urine negatively contributed to the model. The RF model was efficient, accurate, and robust in identifying an association between heavy metals' exposure and CHD among US NHANES 2003–2018 participants. Cesium, thallium, antimony, dimethylarsonic acid, barium, arsenous acid, and total arsenic in urine, and lead and cadmium in blood show positive relationships with CHD, while cobalt, cadmium, and uranium in urine show negative relationships with CHD.
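The reported discrimination figure (AUC 0.827 with a 95% CI of 0.777 to 0.877) can be illustrated with a minimal sketch. The abstract does not say how the interval was computed; the code below assumes a normal-approximation (Wald) interval with the Hanley-McNeil standard error, which is only one common choice. The function names and the toy labels and scores are hypothetical and are not taken from the study.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) approximation to the standard error of an AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def wald_ci(auc, se, z=1.96):
    """Symmetric normal-approximation interval, clipped to [0, 1]."""
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


def implied_se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval (assumes a Wald-type CI)."""
    return (upper - lower) / (2.0 * z)


# Hypothetical toy data: 1 = CHD, 0 = no CHD; scores are model probabilities.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = auc_mann_whitney(toy_labels, toy_scores)
toy_se = hanley_mcneil_se(toy_auc, n_pos=3, n_neg=4)
toy_ci = wald_ci(toy_auc, toy_se)
```

The reported interval is symmetric around 0.827 (0.050 on each side), which is consistent with a Wald-type interval; under that assumption `implied_se_from_ci(0.777, 0.877)` gives a standard error of roughly 0.026.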
•An explainable ML model for CHD identification was built by using multi-source data over 16 years.
•RF model associated with heavy metals for identifying CHD was efficient, accurate, and robust.
•SHAP explained positive and negative relationships between heavy metals and CHD.
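How a Shapley-based explanation assigns signed contributions to features can be sketched with a brute-force computation on a toy model. This is not the study's random forest and not the SHAP library's tree algorithm; it assumes a hypothetical three-feature scoring function and replaces "absent" features with a fixed baseline value, which is one simple way to define the coalition value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                members = set(subset)
                without_i = [x[j] if j in members else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(features):
    """Hypothetical risk score with one interaction term (not from the paper)."""
    x0, x1, x2 = features
    return 0.5 * x0 - 0.3 * x1 + 0.2 * x0 * x2


instance = [1.0, 1.0, 1.0]
reference = [0.0, 0.0, 0.0]
contributions = shapley_values(toy_score, instance, reference)
```

In this toy case the contributions sum to `toy_score(instance) - toy_score(reference)`, the additivity property that gives SHAP its name. A positive value pushes the score up and a negative value pushes it down, which is the sense in which the highlights speak of positive and negative relationships.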
ISSN: 0045-6535
EISSN: 1879-1298
DOI: 10.1016/j.chemosphere.2022.137039