Enhanced Stroke Risk Prediction: A Fusion of Machine Learning Models for Improved Healthcare Strategies


Bibliographic details
Published in: SN Computer Science 2024-11, Vol.5 (8), p.1078, Article 1078
Main authors: Ahmed, Rafeeq; Varshney, Anmol; Ashraf, Zubair; Farooqui, Nafees Akhter; Pathak, Ravi Shanker
Format: Article
Language: English
Description
Summary: Stroke is a serious medical condition that can result in death, as it causes a sudden loss of blood supply to large portions of the brain. Given the rising prevalence of strokes, it is critical to understand the many factors that contribute to them. A robust prediction framework must be developed to identify a person's risk of stroke. This study investigates the effectiveness of several machine learning (ML) techniques, namely Decision Trees (DT), Extra Trees (ET), Random Forest (RF), and Voting Classifiers (VC), in predicting stroke risk. It further shows that whereas certain factors (such as age, gender, and smoking status) have a large impact, others (such as place of residence) have little effect and can be handled through careful feature selection. Principal component analysis (PCA), a dimensionality-reduction approach, is particularly effective when combined with class-balancing methods such as the Synthetic Minority Oversampling Technique (SMOTE). Class balancing is required for imbalanced datasets, such as those in which only 5% of cases indicate stroke risk and 95% are non-stroke cases. SMOTE corrects this skew by synthesizing new minority-class samples interpolated between neighbouring minority samples. We examine each algorithm's Receiver Operating Characteristic (ROC) scores and find that ET, RF, and VC have areas under the curve (AUC) greater than 0.95. After a thorough analysis across performance criteria including recall, accuracy, F1 score, and precision, the Voting Ensemble approach is found to be a better option than current stroke detection methods. Notably, hypertension is identified as a key risk factor, with most hypertensive persons being at risk of stroke. There is also a strong correlation between cardiovascular disease and stroke, with most stroke cases occurring in people who already have a heart condition. It is noteworthy that while 5% of people with heart disease have strokes, 95% of those without cardiac conditions never have a stroke.
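The oversampling step mentioned in the summary can be illustrated with a small sketch. SMOTE is usually described as creating synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The code below is a minimal standard-library illustration of that idea, not the authors' implementation: the function name `smote_like_oversample`, the toy rows, and the two feature columns (an age-like value and a 0/1 hypertension-like flag) are all assumptions made for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    minority: list of equal-length numeric feature lists (hypothetical data).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data mirroring a 95% / 5% split (values are invented for illustration).
majority = [[random.Random(i).uniform(20, 60), 0.0] for i in range(95)]
minority = [[67.0, 1.0], [71.0, 1.0], [74.0, 0.0], [80.0, 1.0], [82.0, 1.0]]

# Generate enough synthetic minority rows to match the majority count.
new_rows = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))  # prints "95 95"
```

Because each synthetic row is a point on the segment between two real minority rows, it stays within the range of the observed minority values. Note that interpolating a 0/1 flag yields fractional values, so categorical columns generally need separate handling in practice.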
ISSN: 2661-8907, 2662-995X
DOI: 10.1007/s42979-024-03389-w