Mining Hypertension Predictors using Decision Tree: Baseline Data of Kharameh Cohort Study

Bibliographic details
Published in: Journal of biostatistics and epidemiology 2024-12, Vol.10 (1)
Main authors: Rezaianzadeh, Abbas; Nematolahi, Samane; Jalali, Maryam; Rezaeianzadeh, Shayan; Ghoddusi Johari, Masoumeh; Hosseini, Seyed Vahid
Format: Article
Language: eng
Description
Summary: Introduction: Hypertension is a serious chronic disease and an important risk factor for many health problems. This study aimed to investigate the factors associated with hypertension using a decision-tree algorithm. Methods: This cross-sectional study was conducted by census in Kharameh City between 2014 and 2017. The study included 2510 hypertensive and 7840 non-hypertensive individuals. To build the decision tree, 70% of the cases were randomly allocated to the training dataset, and the remaining 30% were used as the testing dataset to evaluate the tree's performance. Two models were assessed. The first model (model I) included 15 variables: age, gender, body mass index (BMI), years of education, occupation status, marital status, family history of hypertension, physical activity, total energy, number of meals, salt, oil type, drug use, alcohol use, and smoking. The second model (model II) included 16 variables: age, gender, BMI, and the following laboratory measures: hematocrit (HCT), mean corpuscular hemoglobin concentration (MCHC), platelet count (PLT), fasting blood sugar (FBS), blood urea nitrogen (BUN), creatinine (Cr), triglycerides (TG), cholesterol (CHOL), alkaline phosphatase (ALP), high-density lipoprotein (HDL), gamma-glutamyl transpeptidase (GGT), low-density lipoprotein (LDL), and urinary specific gravity (SG). A confusion matrix was used to measure the performance of the decision tree. Accuracy, sensitivity, specificity, and the receiver operating characteristic (ROC) curve were also determined to compare the models. Results: For model I, the accuracy, sensitivity, specificity, and AUC were 79.2% (77.8-80.6), 82.4% (80.1-84.5), 78.24% (76.4-80), and 0.80 (0.79-0.82), respectively. For model II, the corresponding values were 79.5% (78.2-80.8), 81.0% (78.3-83.6), 79.0% (77.5-80.5), and 0.80 (0.79-0.81), respectively. The confusion matrix of model I showed that, of the 1188 cases with hypertension in the training dataset, 979 were classified correctly; for model II, of the 2812 non-hypertensive cases, 2222 were classified correctly. Conclusion: We have proposed a decision-tree model to identify the risk factors associated with hypertension. This model can be useful for early screening and for improving preventive and curative health services in health promotion.
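The confusion-matrix counts quoted in the summary can be checked against the reported sensitivity and specificity with a few lines of arithmetic. The following is a minimal sketch in Python; the helper name `rate` is hypothetical and only the four counts come from the summary.

```python
def rate(correct: int, total: int) -> float:
    """Proportion of cases in one class that were classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Counts quoted in the summary (no other cohort data is used here).
sensitivity_model_1 = rate(979, 1188)   # hypertensive cases, model I
specificity_model_2 = rate(2222, 2812)  # non-hypertensive cases, model II

print(f"Model I sensitivity:  {sensitivity_model_1:.1%}")
print(f"Model II specificity: {specificity_model_2:.1%}")
```

The two proportions round to 82.4% and 79.0%, which agree with the sensitivity reported for model I and the specificity reported for model II.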
ISSN: 2383-4196, 2383-420X
DOI: 10.18502/jbe.v10i1.17155
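The evaluation workflow described in the summary (a random 70/30 split followed by accuracy, sensitivity, and specificity from a confusion matrix) might look roughly like the sketch below. It is not the authors' code: the records are synthetic, all function names are hypothetical, and a fixed age threshold stands in for the fitted decision tree purely as a placeholder.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_70_30(rows: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Randomly allocate 70% of rows to training and the rest to testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) with 1 = hypertensive, 0 = non-hypertensive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Synthetic toy records of (age, label); these are NOT the cohort data.
rng = random.Random(42)
records = []
for _ in range(1000):
    age = rng.randint(40, 70)
    label = 1 if rng.random() < (age - 35) / 50 else 0
    records.append((age, label))

train, test = split_70_30(records, seed=1)


def predict(age: int) -> int:
    """Placeholder rule; a real analysis would fit a decision tree on `train`."""
    return 1 if age >= 55 else 0  # the threshold 55 is arbitrary


y_true = [label for _, label in test]
y_pred = [predict(age) for age, _ in test]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
result = metrics(tp, fn, fp, tn)

print(len(train), len(test))
print(result)
```

Keeping the test set out of model fitting is what makes these three metrics an estimate of performance on unseen individuals; in practice the placeholder rule would be replaced by a tree fitted on the training portion only.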