XEMLPD: an explainable ensemble machine learning approach for Parkinson disease diagnosis with optimized features

Bibliographic Details
Published in: International journal of speech technology 2024, Vol. 27 (4), p. 1055-1083
Main authors: Khanom, Fahmida; Biswas, Shuvo; Uddin, Mohammad Shorif; Mostafiz, Rafid
Format: Article
Language: English
Online access: Full text
Summary: Parkinson's disease (PD) is a progressive neurological disorder that worsens gradually over time, making early diagnosis difficult. Traditionally, diagnosis relies on a neurologist's detailed assessment of the patient's medical history and multiple scans. Recently, artificial intelligence (AI)-based computer-aided diagnosis (CAD) systems have demonstrated superior performance by capturing complex, nonlinear patterns in clinical data. However, the opaque nature of many AI models, often referred to as "black box" systems, has raised concerns about their transparency, leading clinicians to hesitate to trust their outputs. To address this challenge, we propose an explainable ensemble machine learning framework, XEMLPD, designed to provide both global and local interpretability in PD diagnosis while maintaining high predictive accuracy. Our study used two clinical datasets, carefully curated and optimized through a two-step preprocessing technique that handled outliers and balanced the data, thereby reducing bias. Several ensemble machine learning (EML) models (boosting, bagging, stacking, and voting) were evaluated, with optimized features selected using techniques such as SelectKBest, mRMR, PCA, and LDA. Among these, the stacking model combined with LDA feature optimization consistently delivered the highest accuracy. To ensure transparency, we integrated explainable AI methods, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), into the stacking model. These methods were applied post-evaluation, so that each prediction is accompanied by a detailed explanation. By offering both global and local interpretability, the XEMLPD framework provides clear insight into the model's decision-making process. This transparency helps clinicians develop better treatment strategies and improves the overall prognosis for PD patients. Additionally, our framework serves as a valuable tool for clinical data scientists in building more reliable and interpretable CAD systems.
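The abstract's best-performing configuration, LDA feature optimization feeding a stacking ensemble, can be sketched with scikit-learn. This is a minimal illustration only: the synthetic dataset, the choice of base learners, and all hyperparameters below are assumptions, not the authors' actual XEMLPD configuration or data.

```python
# Hedged sketch: LDA feature reduction + stacking ensemble, loosely following
# the pipeline the abstract describes. Dataset and estimators are illustrative.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a PD clinical dataset (binary: PD vs. healthy).
X, y = make_classification(n_samples=400, n_features=22, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# LDA projects onto at most (n_classes - 1) discriminant axes,
# i.e. a single axis for this binary problem.
stack = make_pipeline(
    LinearDiscriminantAnalysis(n_components=1),
    StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression(),  # meta-learner
        cv=5,  # out-of-fold predictions train the meta-learner
    ),
)
stack.fit(X_train, y_train)
acc = stack.score(X_test, y_test)
print(f"held-out accuracy: {acc:.3f}")
```

For local explanations of the kind the paper attributes to SHAP and LIME, the fitted pipeline could be passed to those libraries' explainers; they are omitted here to keep the sketch dependency-free.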
ISSN: 1381-2416, 1572-8110
DOI: 10.1007/s10772-024-10152-2