Oversampled-Based Approach to Overcome Imbalance Data in the Classification of Apple Leaf Disease with SMOTE

Bibliographic details
Published in: Technium: Romanian Journal of Applied Sciences and Technology, 2023-10, Vol. 16, p. 112-117
Main authors: Eva Y Puspaningrum, Yisti Vita Via, Chilyatun Nisa, Hendra Maulana, Wahyu S.J. Saputra
Format: Article
Language: English
Online access: Full text
Description
Summary: Apple leaf disease detection has been studied with a variety of methods, one of which is digital image processing. In this study, the authors propose a Convolutional Neural Network (CNN) as both feature extractor and classifier for apple leaf images. CNN was chosen because it learns and classifies image features automatically and more effectively than traditional feature extraction methods. The dataset used is Plant Pathology 2020 - FGVC7. In this dataset, the number of images varies greatly across classes, a condition often referred to as data imbalance. In this study, an oversampling technique was successfully applied to handle the uneven (imbalanced) distribution of data and achieved a good evaluation result. The oversampling method used is the Synthetic Minority Oversampling Technique (SMOTE). The imbalanced image data are pre-processed with SMOTE to produce balanced data. The CNN is trained on the training data and its performance is tested on the test data, with a 70:30 split of the total data. The trained network achieves an accuracy of 92% on the oversampled data.
ISSN:2668-778X
DOI:10.47577/technium.v16i.9968
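
The oversampling and 70:30 split mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names (`smote`, `balance`, `split_70_30`), the neighbour count `k`, and the fixed seed are all assumptions made for illustration, and the inputs are assumed to be plain numeric feature vectors rather than images. The summary does not say whether SMOTE was applied before or after the split, so the two steps are kept separate here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from one class (hypothetical helper)."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two samples in the class")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other samples of the same class by distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(features, labels, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        xs = xs + smote(xs, target - len(xs), k, seed)
        out_x.extend(xs)
        out_y.extend([y] * len(xs))
    return out_x, out_y


def split_70_30(features, labels, seed=0):
    """Shuffle and split into 70% training and 30% test data."""
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return ([features[i] for i in train], [labels[i] for i in train],
            [features[i] for i in test], [labels[i] for i in test])


# Toy data: four samples of one class, two of another (labels are made up).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [5.0, 5.0], [5.2, 5.1]]
y = ["healthy"] * 4 + ["rust"] * 2
X_bal, y_bal = balance(X, y, k=1)
print({label: y_bal.count(label) for label in set(y_bal)})
```

Each synthetic sample lies on the line segment between a real minority sample and one of its nearest same-class neighbours, which is the core idea of SMOTE. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written version.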