The analysis of the effects of acute rheumatic fever in childhood on cardiac disease with data mining
Published in: | International journal of medical informatics (Shannon, Ireland), 2019-03, Vol.123, p.68-75 |
---|---|
Format: | Article |
Language: | eng |
Summary: | • ARF is a common disease in many countries, including Turkey. • This is the first study in Turkey to address ARF and data mining analyses together. • 201 records of ARF patients (children) and 54 attributes were used. • Several classification algorithms were used to analyse the ARF data set (naive Bayes classifier, CART, C4.5, C5.0, boosted C5.0 and random forest). The best result was obtained with CART, using the hold-out method with the data set split into 80% training and 20% testing. • Results such as the decision tree obtained from CART can be useful for doctors.
Acute rheumatic fever (ARF) is an important disease that is frequently seen in Turkey, and solutions for treating it need to be developed. It is believed that new data analysis methods may be applied to this disease and may help to discover previously unrecognized patterns. Data mining of existing records and data repositories may improve knowledge of the diagnosis and management of ARF. In this regard, we aimed to contribute to the development of new solutions by approaching the problem from a different standpoint.
The aim of this study is to analyse the effects of ARF experienced during childhood on cardiac disease by using data mining methods.
Classification methods of data mining were used, and experiments were conducted on five algorithms. The records of patients diagnosed with ARF were analysed by building models with the naive Bayes classifier, decision trees (CART, C4.5, C5.0, boosted C5.0) and the random forest algorithm. The performance of the resulting models was then compared. Among model performance evaluation techniques, the hold-out, cross-validation and bootstrap methods were each applied in several configurations. Within the scope of the research, a dataset comprising the records of 297 patients was used, in cooperation with İstanbul Medeniyet University Göztepe Training and Research Hospital’s Pediatric Cardiology Clinic (İstanbul Medeniyet Üniversitesi Göztepe Eğitim ve Araştırma Hastanesi Çocuk Kardiyolojisi Kliniği). After pre-processing, data analysis was carried out on the records of the remaining 201 patients.
The results obtained from the different algorithms were compared based on the model performance evaluation criteria. The best result was obtained with the CART model using the hold-out technique (80% training, 20% testing). According to this model, the importance values of the predicti |
---|---|
ISSN: | 1386-5056 1872-8243 |
DOI: | 10.1016/j.ijmedinf.2018.12.009 |
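
The summary names three ways of evaluating model performance: hold-out (80% training, 20% testing), cross-validation and bootstrap. The following is a minimal sketch of how record indices might be partitioned under each scheme, using only the Python standard library. It is an illustration, not the authors' procedure: the function names, the random seed, the rounding of the 80% share and the fold count `k=10` are all assumptions that do not come from the record.

```python
import random


def holdout_split(n_records, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and test lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Rounding rule is an assumption; the record does not state one.
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


def kfold_splits(n_records, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    k=10 is a hypothetical choice; the record does not give the fold count.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def bootstrap_split(n_records, seed=0):
    """Sample n_records indices with replacement; unsampled ones form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n_records) for _ in range(n_records)]
    sampled = set(train)
    test = [i for i in range(n_records) if i not in sampled]
    return train, test


# 201 is the number of patient records remaining after pre-processing.
train_idx, test_idx = holdout_split(201)
folds = list(kfold_splits(201))
boot_train, boot_test = bootstrap_split(201)
print(len(train_idx), len(test_idx), len(folds), len(boot_train))
```

Each function returns index lists rather than data, so the same partitions could be reused to train and score any of the classifiers listed in the summary on identical splits.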