A stacked ensemble machine learning approach for the prediction of diabetes

Objectives Diabetes has become a leading cause of mortality in both developed and developing countries, impacting a growing number of individuals worldwide. As the prevalence of the disease continues to rise, researchers have diligently worked towards developing accurate diabetes prediction models....

Bibliographic details
Published in: Journal of diabetes and metabolic disorders 2023-11, Vol.23 (1), p.603-617
Main authors: Oliullah, Khondokar, Rasel, Mahedi Hasan, Islam, Md. Manzurul, Islam, Md. Reazul, Wadud, Md. Anwar Hussen, Whaiduzzaman, Md
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 617
container_issue 1
container_start_page 603
container_title Journal of diabetes and metabolic disorders
container_volume 23
creator Oliullah, Khondokar
Rasel, Mahedi Hasan
Islam, Md. Manzurul
Islam, Md. Reazul
Wadud, Md. Anwar Hussen
Whaiduzzaman, Md
description Objectives: Diabetes has become a leading cause of mortality in both developed and developing countries, impacting a growing number of individuals worldwide. As the prevalence of the disease continues to rise, researchers have worked towards developing accurate diabetes prediction models. The primary aim of this study is to use a diverse set of machine learning algorithms to detect the presence of diabetes, particularly in females, at an early stage. By leveraging these methods, this research seeks to provide physicians with tools to identify the disease early, enabling timely interventions and improving patient outcomes. Methods: Several state-of-the-art machine learning techniques were employed: random forest classifiers with GridSearchCV, XGBoost, NGBoost, Bagging, LightGBM, and AdaBoost classifiers. These models were chosen as the base layer of the proposed stacked ensemble model because of their high accuracy. Before the data were fed into the models, the dataset was preprocessed to improve performance. Results: The accuracy achieved in this study was 92.91%, which is competitive with existing approaches. Moreover, Shapley additive explanations (SHAP) were used to facilitate the interpretation of the machine learning models. Conclusion: We anticipate that these findings will be beneficial to healthcare providers, stakeholders, students, and researchers involved in diabetes prediction research and development.
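The stacking idea named in the abstract (base models whose predictions feed a second-level model) can be sketched as follows. This is a minimal, standard-library-only illustration and not the authors' pipeline. The base learners here are single-feature threshold stumps standing in for the random forest and boosting classifiers, the meta-learner is a simple lookup table, the data are synthetic, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_stump(X, y, feature):
    """Base learner (stand-in): best single-feature threshold on the training data."""
    best_t, best_acc = None, -1.0
    for t in sorted({row[feature] for row in X}):
        acc = sum((row[feature] >= t) == bool(label) for row, label in zip(X, y)) / len(y)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda row, t=best_t: 1 if row[feature] >= t else 0

def fit_meta(Z, y):
    """Meta-learner (stand-in): majority label for each pattern of base predictions."""
    votes = defaultdict(Counter)
    for z, label in zip(Z, y):
        votes[tuple(z)][label] += 1
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen patterns
    table = {pattern: c.most_common(1)[0][0] for pattern, c in votes.items()}
    return lambda z: table.get(tuple(z), default)

def fit_stack(X, y, k=5):
    """Train the meta-learner on out-of-fold base predictions, then refit bases on all data."""
    n, n_features = len(X), len(X[0])
    Z = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        bases = [fit_stump([X[i] for i in train], [y[i] for i in train], f)
                 for f in range(n_features)]
        for i in held_out:
            # Each row is predicted only by base models that never saw it.
            Z[i] = [b(X[i]) for b in bases]
    meta = fit_meta(Z, y)
    final_bases = [fit_stump(X, y, f) for f in range(n_features)]
    return lambda row: meta([b(row) for b in final_bases])

# Synthetic data (assumption): the label depends on feature 0, feature 1 is noise.
rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(100)]
y = [1 if row[0] > 0.5 else 0 for row in X]

model = fit_stack(X, y)
preds = [model(row) for row in X]
print("training accuracy:", sum(p == t for p, t in zip(preds, y)) / len(y))
```

Training the meta-learner on out-of-fold predictions, rather than on predictions the base models make for their own training rows, keeps the second level from simply learning to trust overfit base models.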
doi_str_mv 10.1007/s40200-023-01321-2
format Article
eissn 2251-6581
pmid 38932863
publisher Cham: Springer International Publishing
creationdate 2023-11-22
stitle J Diabetes Metab Disord
tpages 15
peer_reviewed true
oa free_for_read
orcid 0009-0003-7481-5982; 0009-0001-1973-2928; 0000-0003-2822-0657
collection PubMed; CrossRef; ProQuest Health & Medical Complete (Alumni); MEDLINE - Academic; PubMed Central (Full Participant titles)
rights The Author(s), under exclusive licence to Tehran University of Medical Sciences 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
COPYRIGHT 2023 BioMed Central Ltd.
Copyright BioMed Central 2024
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11196524/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11196524/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/38932863
fulltext fulltext
identifier ISSN: 2251-6581
ispartof Journal of diabetes and metabolic disorders, 2023-11, Vol.23 (1), p.603-617
issn 2251-6581
2251-6581
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11196524
source Springer Nature - Complete Springer Journals; PubMed Central
subjects Algorithms
Data mining
Developing countries
Diabetes
Endocrinology
Health care industry
India
Machine learning
Medical research
Medicine
Medicine & Public Health
Medicine, Experimental
Metabolic Diseases
Mortality
R&D
Research & development
Research Article
title A stacked ensemble machine learning approach for the prediction of diabetes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T09%3A39%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20stacked%20ensemble%20machine%20learning%20approach%20for%20the%20prediction%20of%20diabetes&rft.jtitle=Journal%20of%20diabetes%20and%20metabolic%20disorders&rft.au=Oliullah,%20Khondokar&rft.date=2023-11-22&rft.volume=23&rft.issue=1&rft.spage=603&rft.epage=617&rft.pages=603-617&rft.issn=2251-6581&rft.eissn=2251-6581&rft_id=info:doi/10.1007/s40200-023-01321-2&rft_dat=%3Cgale_pubme%3EA773877125%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3163825893&rft_id=info:pmid/38932863&rft_galeid=A773877125&rfr_iscdi=true