Enhanced Stroke Risk Prediction: A Fusion of Machine Learning Models for Improved Healthcare Strategies
Published in: | SN Computer Science 2024-11, Vol.5 (8), p.1078, Article 1078 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Stroke is a serious, potentially fatal medical condition caused by a sudden loss of blood supply to large portions of the brain. Given the rising prevalence of strokes, it is critical to understand the factors that contribute to them, and a robust prediction framework is needed to identify a person's risk of stroke. This study investigates the effectiveness of several machine learning (ML) techniques, namely Decision Trees (DT), Extra Trees (ET), Random Forest (RF), and Voting Classifiers (VC), in predicting stroke risk. It also shows that some factors, such as age, gender, and smoking status, have a large impact, whereas others, such as place of residence, have little effect and can be handled through careful feature selection. Principal component analysis (PCA), a dimensionality-reduction approach, is particularly effective when combined with class-balancing methods such as the Synthetic Minority Oversampling Technique (SMOTE). Class balancing is required for imbalanced datasets, such as those in which only 5% of cases indicate stroke risk and 95% are non-stroke cases. SMOTE, which generates synthetic minority-class samples from nearby existing ones, is used to correct this skew. We examine each algorithm's Receiver Operating Characteristic (ROC) scores and find that ET, RF, and VC achieve areas under the curve (AUC) greater than 0.95. After a thorough analysis across several performance criteria, including recall, accuracy, F1 score, and precision, the Voting Ensemble approach is found to outperform current stroke detection methods. Hypertension is identified as a key risk factor, with most hypertensive persons being at risk of stroke. There is also a strong correlation between cardiovascular disease and stroke, with most stroke cases occurring in people who already have a heart condition.
Notably, whereas 5% of people with heart disease have a stroke, 95% of those without cardiac conditions never have one. |
---|---|
ISSN: | 2661-8907 2662-995X |
DOI: | 10.1007/s42979-024-03389-w |
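
The summary describes using SMOTE to correct a 95%/5% class imbalance. As a rough illustration of the general idea only (not the authors' implementation), the sketch below creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_sketch`, the two toy features (age and average glucose level), and all parameter values are assumptions made for this example; a real study would more likely rely on an established library implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration; assumes len(minority) >= 2."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data with assumed features (age, average glucose level):
# 95 non-stroke rows and 5 stroke rows, mirroring the 95%/5% split.
data_rng = random.Random(42)
majority = [(data_rng.uniform(20, 60), data_rng.uniform(70, 110)) for _ in range(95)]
minority = [(data_rng.uniform(55, 85), data_rng.uniform(100, 220)) for _ in range(5)]

# Generate enough synthetic stroke rows to match the majority class size.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "95 95"
```

Because each synthetic point lies on a line segment between two existing minority samples, the new rows stay inside the region already occupied by the minority class instead of being exact copies. In practice, oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as ROC AUC are computed on untouched data.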