Bagged based ensemble model to predict thyroid disorder using linear discriminant analysis with SMOTE

Bibliographic details
Published in: Research on Biomedical Engineering 2023-09, Vol.39 (3), p.733-746
Main authors: Kour, Haneet; Singh, Bhupat; Gupta, Nitin; Manhas, Jatinder; Sharma, Vinod
Format: Article
Language: English
Description
Summary:
Introduction: Machine learning (ML) methods have played a noteworthy role in the prediction of medical conditions in recent years. Implementing these approaches results in better diagnostic procedures.
Methods: This study proposes a bagging-based ensemble of linear discriminant analysis (LDA) models with the SMOTE (synthetic minority oversampling technique) strategy for the identification of thyroid disorders. The ensemble uses bagging to combine five conventional LDA models as base learners with majority voting. It consists of four steps: (1) generate bootstrap samples by random sampling with replacement; (2) apply SMOTE to each bootstrap sample; (3) train an LDA classifier on the output of the second step; (4) make the final prediction for new observations by aggregating all trained LDA classifiers through majority voting. The two most common thyroid disorders, hyperthyroidism and hypothyroidism, were considered.
Results: Experiments were performed on a primary thyroid disorder dataset of 1092 records and a secondary dataset of 7200 records. Performance was evaluated with four metrics: accuracy, recall, precision, and F-score. The proposed model achieved an accuracy of 85.45% on the primary data and 82.71% on the secondary data. It was also compared with conventional ML classifiers.
Conclusion: The proposed approach can predict thyroid disorder effectively: it raised the accuracy of the classic LDA model for thyroid disease diagnosis from 69.55% to 85.45% on the primary dataset and from 75.28% to 82.71% on the secondary dataset.
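The four steps named in the summary (bootstrap sampling, SMOTE on each sample, one LDA per sample, majority voting) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the ridge term, the neighbour count k=5, the "oversample every minority class up to the majority count" rule, and the tie-breaking rule are all assumptions made for illustration; only the five base learners and the order of the steps come from the summary. The SMOTE and LDA routines here are simplified hand-written stand-ins for library versions such as imbalanced-learn's SMOTE and scikit-learn's LinearDiscriminantAnalysis.

```python
import numpy as np


def smote(X, y, rng, k=5):
    """Simplified SMOTE (assumed variant): grow each minority class to the
    majority-class size by interpolating between a sample and one of its
    k nearest neighbours from the same class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, n in zip(classes, counts):
        if n == target or n < 2:
            continue  # already balanced, or too few points to interpolate
        Xc = X[y == c]
        # Pairwise distances inside the class; inf on the diagonal skips self.
        d = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        neighbours = np.argsort(d, axis=1)[:, :min(k, n - 1)]
        m = target - n
        base = rng.integers(0, n, size=m)
        pick = neighbours[base, rng.integers(0, neighbours.shape[1], size=m)]
        gap = rng.random((m, 1))
        X_parts.append(Xc[base] + gap * (Xc[pick] - Xc[base]))
        y_parts.append(np.full(m, c))
    return np.vstack(X_parts), np.concatenate(y_parts)


def fit_lda(X, y, ridge=1e-6):
    """Classic LDA with a pooled covariance matrix; the small ridge term is
    an assumption added for numerical stability."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(X) - len(classes))
    cov += ridge * np.eye(X.shape[1])
    coef = np.linalg.solve(cov, means.T).T  # row k is inv(cov) @ mean_k
    intercept = -0.5 * np.sum(means * coef, axis=1) + np.log(priors)
    return classes, coef, intercept


def predict_lda(model, X):
    """Assign each row of X to the class with the largest linear score."""
    classes, coef, intercept = model
    return classes[np.argmax(X @ coef.T + intercept, axis=1)]


def fit_bagged_lda(X, y, n_estimators=5, seed=0):
    """Steps 1 to 3: bootstrap, SMOTE, then one LDA per bootstrap sample."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), size=len(X))  # sampling with replacement
        Xb, yb = smote(X[idx], y[idx], rng)
        models.append(fit_lda(Xb, yb))
    return models


def predict_bagged(models, X):
    """Step 4: majority vote; ties go to the smallest label (assumption)."""
    votes = np.stack([predict_lda(m, X) for m in models])
    labels = np.unique(votes)
    counts = np.array([(votes == lab).sum(axis=0) for lab in labels])
    return labels[np.argmax(counts, axis=0)]
```

A call such as `predict_bagged(fit_bagged_lda(X_train, y_train), X_test)` would run the whole pipeline on hypothetical arrays `X_train`, `y_train`, and `X_test`. Applying SMOTE inside the loop, after the bootstrap draw, mirrors the order given in the summary: each base learner sees its own balanced resample rather than one shared oversampled dataset.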
ISSN: 2446-4740
DOI: 10.1007/s42600-023-00307-6