Bagged based ensemble model to predict thyroid disorder using linear discriminant analysis with SMOTE
Published in: | Research on Biomedical Engineering 2023-09, Vol.39 (3), p.733-746 |
---|---|
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | Introduction
Machine learning (ML) methods have played a notable role in the prediction of medical conditions in recent years. Applying these approaches can improve diagnostic procedures.
Methods
In the current study, a bagging-based ensemble model of linear discriminant analysis (LDA) with the SMOTE strategy is proposed for the identification of thyroid disorders. The ensemble uses bagging to combine five conventional LDA models as base learners through majority voting. It consists of four steps: the first generates bootstrap samples by random sampling with replacement; the second applies SMOTE (synthetic minority oversampling technique) to each bootstrap sample; the third fits an LDA classifier on the output of the second step; and the fourth predicts new observations by aggregating all trained LDA classifiers through majority voting. The study considers the two most common thyroid disorders, hyperthyroidism and hypothyroidism.
Results
The experiments were performed on a primary dataset and a secondary dataset of thyroid disorders, containing 1092 and 7200 records, respectively. The proposed approach was evaluated on four performance metrics: accuracy, recall, precision, and F-score. The model achieved an accuracy of 85.45% on the primary data and 82.71% on the secondary data. The proposed model was also compared with conventional ML classifiers for performance evaluation.
Conclusion
The proposed approach predicts thyroid disorders more accurately than the classic LDA model: accuracy rose from 69.55% to 85.45% on the primary dataset and from 75.28% to 82.71% on the secondary dataset. |
---|---|
ISSN: | 2446-4740 |
DOI: | 10.1007/s42600-023-00307-6 |
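
The four steps named in the Methods paragraph (bootstrap sampling, SMOTE on each bootstrap sample, an LDA base learner, majority voting) can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: every function name is hypothetical, and the neighbour count `k`, the choice to oversample each non-majority class up to the majority count, and the small `ridge` term added to the covariance diagonal are assumptions that the abstract does not specify.

```python
import math
import random
from collections import Counter


def bootstrap(X, y, rng):
    """Step 1: draw len(X) rows at random with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]


def smote(X, y, rng, k=5):
    """Step 2: SMOTE-style oversampling (assumed: balance to the majority count)."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pts = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(pts)
            # k nearest same-class neighbours of the chosen sample
            neighbours = sorted(
                (p for p in pts if p is not base),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
            )[:k]
            other = rng.choice(neighbours) if neighbours else base
            gap = rng.random()  # interpolate between base and its neighbour
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_lda(X, y, ridge=1e-6):
    """Step 3: LDA with a pooled covariance; one (label, weights, bias) per class."""
    n, d = len(X), len(X[0])
    labels = sorted(set(y))
    means = {}
    for lab in labels:
        pts = [x for x, other in zip(X, y) if other == lab]
        means[lab] = [sum(col) / len(pts) for col in zip(*pts)]
    # ridge on the diagonal is an assumption added for numerical stability
    cov = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    dof = max(n - len(labels), 1)
    for x, lab in zip(X, y):
        diff = [a - m for a, m in zip(x, means[lab])]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / dof
    model = []
    for lab in labels:
        w = solve(cov, means[lab])
        bias = -0.5 * sum(wi * mi for wi, mi in zip(w, means[lab]))
        bias += math.log(y.count(lab) / n)
        model.append((lab, w, bias))
    return model


def predict_lda(model, x):
    """Return the class with the largest linear discriminant score."""
    return max(model, key=lambda m: sum(w * v for w, v in zip(m[1], x)) + m[2])[0]


def fit_ensemble(X, y, n_models=5, seed=0):
    """Steps 1-3 repeated for each of the base learners (five in the abstract)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        Xb, yb = bootstrap(X, y, rng)
        Xs, ys = smote(Xb, yb, rng)
        models.append(fit_lda(Xs, ys))
    return models


def predict_ensemble(models, x):
    """Step 4: majority vote over the trained LDA classifiers."""
    votes = Counter(predict_lda(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Calling `fit_ensemble(X, y)` on a list of feature rows and a list of class labels returns the five trained base models, and `predict_ensemble(models, x)` returns the majority-vote label for a new row. In practice a tested library implementation of SMOTE and LDA would likely be preferable to this hand-written version, which favours readability over speed.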