A novel RFE-GRU model for diabetes classification using PIMA Indian dataset

Bibliographic details
Published in:	Scientific reports 2025-01, Vol.15 (1), p.982-22, Article 982
Main authors:	Shams, Mahmoud Y., Tarek, Zahraa, Elshewey, Ahmed M.
Format:	Article
Language:	eng
Subjects:	639/705; 692/308; 692/700; Algorithms; Bayes Theorem; Blood levels; Classification; Databases, Factual; Datasets; Diabetes; Diabetes classification; Diabetes mellitus; Diabetes Mellitus - classification; Diabetes Mellitus - diagnosis; Diabetes Mellitus - epidemiology; Gated recurrent unit (GRU); Humanities and Social Sciences; Humans; KNN; Learning algorithms; Logistic Models; Machine Learning; multidisciplinary; Myocardial infarction; Pima People; Recursive feature elimination (RFE); Renal failure; Science; Science (multidisciplinary)
Online access:	Full text

Description
Abstract:	Diabetes is a long-term condition characterized by elevated blood sugar levels. It can lead to a variety of complex disorders such as stroke, renal failure, and heart attack. Machine learning can help diagnose diabetes at an early stage, which matters because the disease cannot be cured and places a significant burden on the health-care system. The PIMA Indian diabetes dataset (PIDD) has been used for classification in several studies; it includes 768 instances and 9 features, of which eight are predictors and one is the target. First, we performed a preprocessing stage that includes mean imputation and data normalization. We then trained several machine learning (ML) models on the extracted features: Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Histogram Gradient Boosting (HGB), and Gated Recurrent Unit (GRU). To classify the PIDD, this paper proposes a new model called Recursive Feature Elimination-GRU (RFE-GRU). RFE selects the features in the training dataset that are most important for predicting the target variable, while the GRU addresses the vanishing and exploding gradient problem when learning from the features that RFE selects. The RFE-GRU model achieved a precision of 90.50%, recall of 90.70%, F1-score of 90.50%, accuracy of 90.70%, and Area Under the Curve (AUC) of 0.9278. The comparative results showed that the RFE-GRU model outperforms the other classification models.
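
The preprocessing and feature-selection steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which normalization or which estimator RFE wraps, so min-max scaling and a plain logistic regression are assumptions made here, the GRU classifier is omitted, and the function names and the six-row toy table are hypothetical stand-ins for the PIDD.

```python
import math

def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def min_max_normalize(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0 (assumed scaling)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

def fit_logistic(X, y, epochs=300, lr=0.5):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, drop the smallest-|weight| feature, repeat."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        X_sub = [[row[j] for j in kept] for row in X]
        w, _ = fit_logistic(X_sub, y)
        weakest = min(range(len(kept)), key=lambda k: abs(w[k]))
        kept.pop(weakest)
    return kept

# Hypothetical toy table: column 0 separates the classes, column 1 is
# uninformative, column 2 is constant; None marks a missing measurement.
rows = [
    [1.0, 5.0, 7.0],
    [2.0, 3.0, 7.0],
    [None, 4.0, 7.0],
    [8.0, 3.0, 7.0],
    [9.0, None, 7.0],
    [10.0, 5.0, 7.0],
]
labels = [0, 0, 0, 1, 1, 1]

imputed = impute_mean(rows)
X = min_max_normalize(imputed)
kept = rfe(X, labels, n_keep=1)
print("indices of retained features:", kept)
```

In a full pipeline the retained columns would then be passed to the classifier (a GRU in the paper). The elimination loop refits the estimator after every removal, which is what distinguishes RFE from a one-shot ranking of features.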
ISSN:	2045-2322
DOI:	10.1038/s41598-024-82420-9