Using Machine Learning Algorithm as a Method for Improving Stroke Prediction

Sudden strokes have a severe negative impact on society, which has prompted efforts to improve the diagnosis and management of stroke. Technological advances have also affected the medical field: caregivers now have better options for tak...

Bibliographic details
Published in:	International journal of advanced computer science & applications 2023, Vol.14 (4)
Main authors:	Alageel, Nojood; Alharbi, Rahaf; Alharbi, Rehab; Alsayil, Maryam; Alharbi, Lubna A.
Format:	Article
Language:	eng
Subjects:	Algorithms; Datasets; Decision trees; Electronic health records; Heart diseases; Hypertension; Machine learning; Principal components analysis; Recall; Statistical methods; Stroke
Online access:	Full text

container_end_page
container_issue	4
container_start_page
container_title	International journal of advanced computer science & applications
container_volume	14
creator	Alageel, Nojood; Alharbi, Rahaf; Alharbi, Rehab; Alsayil, Maryam; Alharbi, Lubna A.
description	Sudden strokes have a severe negative impact on society, which has prompted efforts to improve the diagnosis and management of stroke. Technological advances have also affected the medical field: caregivers now have better options for taking care of their patients by mining and archiving medical records for ease of retrieval. It is also essential to understand the risk factors that make a patient more susceptible to stroke, since some of these factors make stroke prediction much easier. This research analyzes the factors that improve stroke prediction based on electronic health records. The most important factors are identified using statistical methods and Principal Component Analysis (PCA); the most critical are age, average glucose level, heart disease, and hypertension. Because the stroke-occurrence dataset is highly imbalanced, a balanced dataset created by sub-sampling is used for model evaluation. Seven machine learning algorithms are trained on the Kaggle dataset to predict the occurrence of stroke in patients: Naïve Bayes (NB), SVM, Random Forest, KNN, Decision Tree, Stacking, and majority voting. After preprocessing and splitting the dataset into training and testing subsets, the algorithms were evaluated by accuracy, F1 score, recall, and precision. The NB classifier achieved the lowest accuracy (86%), whereas the remaining algorithms achieved similar results: accuracy of 96%, F1 score of 0.98, precision of 0.97, and recall of 1.
doi_str_mv	10.14569/IJACSA.2023.0140481
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2819915856</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2819915856</sourcerecordid><originalsourceid>FETCH-LOGICAL-c274t-10df3b251caf92bcf94803cca85e4faa2669c3063cd67b6f0cebb3e2e2bb03d33</originalsourceid><addsrcrecordid>eNotkMtqwzAQRUVpoSHNH3Qh6Nqp3raWJvTh4tBCGuhOSLKUOE2sVHIK_fs6j9nMcLnM3DkA3GM0xYwL-Vi9lbNFOSWI0CnCDLECX4ERwVxknOfo-jQXGUb51y2YpLRBQ1FJREFHoF6mtlvBubbrtnOwdjp2R6HcrkJs-_UO6gQ1nLt-HRroQ4TVbh_D79Gz6GP4dvAjuqa1fRu6O3Dj9Ta5yaWPwfL56XP2mtXvL9WsrDNLctYPSRpPDeHYai-JsV6yAlFrdcEd81oTIaSlSFDbiNwIj6wzhjriiDGINpSOwcN575Dk5-BSrzbhELvhpCIFlnJ4l4vBxc4uG0NK0Xm1j-1Oxz-FkTqhU2d06ohOXdDRf4l0Ye4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2819915856</pqid></control><display><type>article</type><title>Using Machine Learning Algorithm as a Method for Improving Stroke Prediction</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Alageel, Nojood ; Alharbi, Rahaf ; Alharbi, Rehab ; Alsayil, Maryam ; Alharbi, Lubna A.</creator><creatorcontrib>Alageel, Nojood ; Alharbi, Rahaf ; Alharbi, Rehab ; Alsayil, Maryam ; Alharbi, Lubna A.</creatorcontrib><description>Having sudden strokes has had a very negative impact on all aspects in society to the point that it attracted efforts for better improvement and management of stroke diagnosis. Technological advancement also had an impact on the medical field such that nowadays caregivers have better options for taking care of their patients by mining and archiving their medical records for ease of retrieval. Furthermore, it is quite essential to understand the risk factors that make a patient more susceptible to strokes, thus there are some factors that make stroke prediction much easier. 
This research offers an analysis of the factors that enhance the stroke prediction process based on electronic health records. The most important factors for stroke prediction will be identified using statistical methods and Principal Component Analysis (PCA). It has been found that the most critical factors affecting stroke prediction are the age, average glucose level, heart disease, and hypertension. A balanced dataset is used for the model evaluation which was created by sub-sampling since the dataset for stroke occurrence is already highly imbalanced. In this study, seven different machine learning algorithms are implemented: Naïve Bayes, SVM, Random Forest, KNN, Decision Tree, Stacking, and majority voting to train on the Kaggle dataset to predict occurrence of stroke in patients. After preprocessing and splitting the dataset into training and testing sub-datasets, these proposed algorithms were evaluated according to accuracy, f1 score, recall value, and precision value. The NB classifier achieved the lowest accuracy level (86%), whereas the rest of the algorithms achieved similar accuracies 96%, f1 scores 0.98, precision 0.97, and recall 1.</description><identifier>ISSN: 2158-107X</identifier><identifier>EISSN: 2156-5570</identifier><identifier>DOI: 10.14569/IJACSA.2023.0140481</identifier><language>eng</language><publisher>West Yorkshire: Science and Information (SAI) Organization Limited</publisher><subject>Algorithms ; Datasets ; Decision trees ; Electronic health records ; Heart diseases ; Hypertension ; Machine learning ; Principal components analysis ; Recall ; Statistical methods ; Stroke</subject><ispartof>International journal of advanced computer science & applications, 2023, Vol.14 (4)</ispartof><rights>2023. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,4024,27923,27924,27925</link.rule.ids></links><search><creatorcontrib>Alageel, Nojood</creatorcontrib><creatorcontrib>Alharbi, Rahaf</creatorcontrib><creatorcontrib>Alharbi, Rehab</creatorcontrib><creatorcontrib>Alsayil, Maryam</creatorcontrib><creatorcontrib>Alharbi, Lubna A.</creatorcontrib><title>Using Machine Learning Algorithm as a Method for Improving Stroke Prediction</title><title>International journal of advanced computer science & applications</title><description>Having sudden strokes has had a very negative impact on all aspects in society to the point that it attracted efforts for better improvement and management of stroke diagnosis. Technological advancement also had an impact on the medical field such that nowadays caregivers have better options for taking care of their patients by mining and archiving their medical records for ease of retrieval. Furthermore, it is quite essential to understand the risk factors that make a patient more susceptible to strokes, thus there are some factors that make stroke prediction much easier. This research offers an analysis of the factors that enhance the stroke prediction process based on electronic health records. The most important factors for stroke prediction will be identified using statistical methods and Principal Component Analysis (PCA). It has been found that the most critical factors affecting stroke prediction are the age, average glucose level, heart disease, and hypertension. 
A balanced dataset is used for the model evaluation which was created by sub-sampling since the dataset for stroke occurrence is already highly imbalanced. In this study, seven different machine learning algorithms are implemented: Naïve Bayes, SVM, Random Forest, KNN, Decision Tree, Stacking, and majority voting to train on the Kaggle dataset to predict occurrence of stroke in patients. After preprocessing and splitting the dataset into training and testing sub-datasets, these proposed algorithms were evaluated according to accuracy, f1 score, recall value, and precision value. The NB classifier achieved the lowest accuracy level (86%), whereas the rest of the algorithms achieved similar accuracies 96%, f1 scores 0.98, precision 0.97, and recall 1.</description><subject>Algorithms</subject><subject>Datasets</subject><subject>Decision trees</subject><subject>Electronic health records</subject><subject>Heart diseases</subject><subject>Hypertension</subject><subject>Machine learning</subject><subject>Principal components analysis</subject><subject>Recall</subject><subject>Statistical 
methods</subject><subject>Stroke</subject><issn>2158-107X</issn><issn>2156-5570</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNotkMtqwzAQRUVpoSHNH3Qh6Nqp3raWJvTh4tBCGuhOSLKUOE2sVHIK_fs6j9nMcLnM3DkA3GM0xYwL-Vi9lbNFOSWI0CnCDLECX4ERwVxknOfo-jQXGUb51y2YpLRBQ1FJREFHoF6mtlvBubbrtnOwdjp2R6HcrkJs-_UO6gQ1nLt-HRroQ4TVbh_D79Gz6GP4dvAjuqa1fRu6O3Dj9Ta5yaWPwfL56XP2mtXvL9WsrDNLctYPSRpPDeHYai-JsV6yAlFrdcEd81oTIaSlSFDbiNwIj6wzhjriiDGINpSOwcN575Dk5-BSrzbhELvhpCIFlnJ4l4vBxc4uG0NK0Xm1j-1Oxz-FkTqhU2d06ohOXdDRf4l0Ye4</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Alageel, Nojood</creator><creator>Alharbi, Rahaf</creator><creator>Alharbi, Rehab</creator><creator>Alsayil, Maryam</creator><creator>Alharbi, Lubna A.</creator><general>Science and Information (SAI) Organization Limited</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7XB</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>M2O</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope></search><sort><creationdate>2023</creationdate><title>Using Machine Learning Algorithm as a Method for Improving Stroke Prediction</title><author>Alageel, Nojood ; Alharbi, Rahaf ; Alharbi, Rehab ; Alsayil, Maryam ; Alharbi, Lubna 
A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c274t-10df3b251caf92bcf94803cca85e4faa2669c3063cd67b6f0cebb3e2e2bb03d33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Datasets</topic><topic>Decision trees</topic><topic>Electronic health records</topic><topic>Heart diseases</topic><topic>Hypertension</topic><topic>Machine learning</topic><topic>Principal components analysis</topic><topic>Recall</topic><topic>Statistical methods</topic><topic>Stroke</topic><toplevel>online_resources</toplevel><creatorcontrib>Alageel, Nojood</creatorcontrib><creatorcontrib>Alharbi, Rahaf</creatorcontrib><creatorcontrib>Alharbi, Rehab</creatorcontrib><creatorcontrib>Alsayil, Maryam</creatorcontrib><creatorcontrib>Alharbi, Lubna A.</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Research 
Library</collection><collection>Research Library (Corporate)</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>International journal of advanced computer science & applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alageel, Nojood</au><au>Alharbi, Rahaf</au><au>Alharbi, Rehab</au><au>Alsayil, Maryam</au><au>Alharbi, Lubna A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Using Machine Learning Algorithm as a Method for Improving Stroke Prediction</atitle><jtitle>International journal of advanced computer science & applications</jtitle><date>2023</date><risdate>2023</risdate><volume>14</volume><issue>4</issue><issn>2158-107X</issn><eissn>2156-5570</eissn><abstract>Having sudden strokes has had a very negative impact on all aspects in society to the point that it attracted efforts for better improvement and management of stroke diagnosis. Technological advancement also had an impact on the medical field such that nowadays caregivers have better options for taking care of their patients by mining and archiving their medical records for ease of retrieval. Furthermore, it is quite essential to understand the risk factors that make a patient more susceptible to strokes, thus there are some factors that make stroke prediction much easier. This research offers an analysis of the factors that enhance the stroke prediction process based on electronic health records. 
The most important factors for stroke prediction will be identified using statistical methods and Principal Component Analysis (PCA). It has been found that the most critical factors affecting stroke prediction are the age, average glucose level, heart disease, and hypertension. A balanced dataset is used for the model evaluation which was created by sub-sampling since the dataset for stroke occurrence is already highly imbalanced. In this study, seven different machine learning algorithms are implemented: Naïve Bayes, SVM, Random Forest, KNN, Decision Tree, Stacking, and majority voting to train on the Kaggle dataset to predict occurrence of stroke in patients. After preprocessing and splitting the dataset into training and testing sub-datasets, these proposed algorithms were evaluated according to accuracy, f1 score, recall value, and precision value. The NB classifier achieved the lowest accuracy level (86%), whereas the rest of the algorithms achieved similar accuracies 96%, f1 scores 0.98, precision 0.97, and recall 1.</abstract><cop>West Yorkshire</cop><pub>Science and Information (SAI) Organization Limited</pub><doi>10.14569/IJACSA.2023.0140481</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2158-107X
ispartof	International journal of advanced computer science & applications, 2023, Vol.14 (4)
issn	2158-107X 2156-5570
language	eng
recordid	cdi_proquest_journals_2819915856
source	EZB-FREE-00999 freely available EZB journals
subjects	Algorithms; Datasets; Decision trees; Electronic health records; Heart diseases; Hypertension; Machine learning; Principal components analysis; Recall; Statistical methods; Stroke
title	Using Machine Learning Algorithm as a Method for Improving Stroke Prediction
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T12%3A46%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Using%20Machine%20Learning%20Algorithm%20as%20a%20Method%20for%20Improving%20Stroke%20Prediction&rft.jtitle=International%20journal%20of%20advanced%20computer%20science%20&%20applications&rft.au=Alageel,%20Nojood&rft.date=2023&rft.volume=14&rft.issue=4&rft.issn=2158-107X&rft.eissn=2156-5570&rft_id=info:doi/10.14569/IJACSA.2023.0140481&rft_dat=%3Cproquest_cross%3E2819915856%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2819915856&rft_id=info:pmid/&rfr_iscdi=true
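
The abstract in the description field mentions two steps that a short sketch may make more concrete: balancing the data by sub-sampling, and scoring a classifier by accuracy, precision, recall, and F1. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (undersample, binary_metrics, f1_score), the fixed seed, and the choice of random sub-sampling down to the smallest class are all hypothetical; the record does not say how the sub-sampling was carried out.

```python
import random


def undersample(rows, label_of, seed=0):
    """Randomly shrink every class to the size of the smallest class.

    Assumption: plain random sub-sampling; the record does not specify
    which sub-sampling scheme the study used.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }
```

As a consistency check on the reported figures, f1_score(0.97, 1.0) evaluates to about 0.985, which agrees with the F1 score of 0.98 quoted in the abstract for a precision of 0.97 and a recall of 1.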